Right-to-Left (RTL) support for Hebrew and Arabic #219
Comments
I don't know much about RTL languages, but it seems to me that you could reverse the string and align to the right to get this working (I'm probably wrong here). However, if the text contains a combination of LTR and RTL text, then we'll need an implementation of the Unicode Bidi Algorithm. Those who know more than I do, please fill me in. I'd love to see this implemented so PDFKit is more widely usable. |
A separate issue is vertical text support (e.g. Japanese), which I'd also like to see and which has its own challenges. |
Arabic also has its own challenges because letters get a different shape depending on their position in a word (beginning, middle, end) so this is anything but easy :) |
Interesting, I assume there is some sort of algorithm out there to determine this? Starting to sound like a lot of work. |
I'm just parachuting in, but isn't this something like what you need: Why re-implement Unicode algorithms? EDIT: Wait a moment, this seems to be quite far from what's needed, my bad. But isn't there a ready implementation? |
No, that's just Unicode character metadata, not any actual algorithms. RTL support will require an implementation of the Unicode Bidi Algorithm. Shaping of Arabic text with contextual substitutions is a separate problem to solve. |
Yeah, this library from Twitter might work but I haven't tried it. |
I'll go ahead and fork |
I'd try Twitter's library and see if it produces the results you expect. Sorry for being so ignorant on this, but does it work to run the text through that library, then send the result to the PDFKit |
I found something that might be even more to the point: |
Yeah, the problem is that node-icu-bidi is a node C++ module, but PDFKit also works in the browser, so everything must be pure JavaScript. If it works for your needs, feel free to use it, but PDFKit won't take on a non-JS dependency. |
I understand, so an acceptable solution would be to extract the BIDI On Tue, Aug 5, 2014 at 8:00 PM, Devon Govett notifications@github.com
|
Maybe I'm late to the party; I just wanted to mention that I implemented a similar solution a long time ago in DOS (with the old-fashioned 16x16 bitmap fonts), and I think the same approach can be applied here. 1- reorder the input string using the Bidi algorithm 1 and 4 are the 'easy parts'; 2 and 3 are another story: for OpenType fonts I think there is a GSUB table that can be used for this, but for other font types I think the only option is to implement the specific algorithm for each script (as you said, this is a lot of work) |
It seems another solution to Arabic shaping is 'text-based shaping', which transforms the characters at the string level rather than at the glyph level (further details are there). It also seems there is already an implementation of this kind in JavaScript by the ibm-js team. From the sources it appears that the text engine performs a bunch of operations at the character level: 1- Bidi reordering This could also be a fallback for non-OpenType fonts, which don't have a GSUB table |
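To make the string-level idea concrete, here is a minimal sketch that swaps base Arabic letters for their Unicode presentation forms depending on their neighbours. It is not taken from the ibm-js code: the `FORMS` table and the `shapeText` name are hypothetical, the table covers only ALEF and BEH, and it ignores ligatures, diacritics, and every other letter.

```javascript
// Hypothetical two-letter table, for illustration only.
// Each entry lists [isolated, final, initial, medial] presentation forms;
// null means the form does not exist (the letter never joins to the next one).
const FORMS = new Map([
  ["\u0627", ["\uFE8D", "\uFE8E", null, null]],         // ALEF
  ["\u0628", ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]], // BEH
]);

// Takes text in logical (typing) order and returns it with contextual forms.
function shapeText(text) {
  const chars = Array.from(text);
  return chars
    .map((ch, i) => {
      const forms = FORMS.get(ch);
      if (!forms) return ch; // not in the table: leave untouched
      const prev = FORMS.get(chars[i - 1]);
      const next = FORMS.get(chars[i + 1]);
      // Joined to the previous letter only if that letter can connect forward.
      const joinsPrev = Boolean(prev && prev[2]);
      // Joined to the next letter only if this letter can connect forward.
      const joinsNext = Boolean(forms[2] && next);
      if (joinsPrev && joinsNext) return forms[3]; // medial
      if (joinsPrev) return forms[1];              // final
      if (joinsNext) return forms[2];              // initial
      return forms[0];                             // isolated
    })
    .join("");
}
```

A real reshaper needs the full joining-type data for every letter plus the lam-alef ligatures, which is why a maintained library is the safer choice.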
If you prioritize Bidi reordering, and symmetrical swapping, it's enough for Hebrew support. |
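For readers unfamiliar with symmetrical swapping (Unicode calls it bidi mirroring), a minimal sketch follows. The `MIRROR` table and the `mirrorBrackets` name are made up for this example, only four ASCII pairs are covered, and the input is assumed to be a run that is entirely RTL; the full algorithm mirrors only characters that resolve to an RTL level.

```javascript
// Hypothetical helper, not part of PDFKit.
const MIRROR = new Map([
  ["(", ")"], [")", "("],
  ["[", "]"], ["]", "["],
  ["{", "}"], ["}", "{"],
  ["<", ">"], [">", "<"],
]);

// Swaps each paired bracket for its mirror image; other characters pass through.
function mirrorBrackets(rtlRun) {
  return Array.from(rtlRun, (ch) => (MIRROR.has(ch) ? MIRROR.get(ch) : ch)).join("");
}
```

Without this step, a parenthesised Hebrew phrase that has been reordered for display ends up with its brackets facing outwards.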
I found the following information related to this topic. Python Arabic Reshaper is a library which can be used in cases when native Arabic support is not available. The readme contains a good explanation of the issue and the solution. This library has been ported to JavaScript. On the BIDI topic I found this test program written in JavaScript. |
There are GSUB (Glyph substitution) tables in font files for Complex languages. |
PDFKIT still has a problem with RTL |
Hi! @devongovett any update on RTL support? Question 2, is this project dead? |
Please don't be dead :( |
pdfkit has more functionality than jsPDF. jsPDF doesn't have full Unicode support but pdfkit does. The project and its committers deserve the praise. For RTL, right-aligned text works very well. However, when we want to use columns, things change. The need is just to start from the right-most column and move to the left-most column. @devongovett I don't think we need anything except this, because the RTL text already flows the RTL way, so there is no need to reverse the strings. (same for LTR inside RTL) |
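The column request above boils down to computing the column positions from the right edge instead of the left. A rough sketch, assuming you lay out each column yourself; `rtlColumnOffsets` and its parameters are hypothetical names, not a PDFKit API:

```javascript
// Hypothetical helper: returns the x offset of each column in RTL fill order,
// i.e. the first offset belongs to the right-most column.
function rtlColumnOffsets(left, contentWidth, columns, gap) {
  const columnWidth = (contentWidth - gap * (columns - 1)) / columns;
  const offsets = [];
  for (let i = columns - 1; i >= 0; i--) {
    offsets.push(left + i * (columnWidth + gap));
  }
  return { columnWidth, offsets };
}
```

Each column could then be drawn with its own `doc.text(chunk, x, y, { width: columnWidth, align: 'right' })` call. Splitting the text into per-column chunks is left out here, and this does nothing about bidi reordering inside a line.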
RTL is much more than right aligned text. There’s the issue of comma and
dot positions, and what happens when LTR stuff like numbers and English
text are mixed in a sentence.
|
This is exactly what I am looking for. Just a bit of improvement: |
Just a note for whoever is still stuck on this that reversing the text is not a good idea. It will reverse things like numbers and various other things that should not be reversed. Use a library meant for this, like TwitterCldr, see #219 (comment) |
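A quick way to see the problem described in the note above, using a plain character reversal (the `naiveReverse` name is just for this demo):

```javascript
// Naive approach: reverse every character of the string.
function naiveReverse(text) {
  return Array.from(text).reverse().join("");
}

const logical = "\u05E9\u05DC\u05D5\u05DD 123"; // Hebrew "shalom" followed by 123, in typing order
const reversed = naiveReverse(logical);

// The Hebrew letters are now in left-to-right visual order,
// but the number has been reversed as well and reads 321.
console.log(reversed.startsWith("321")); // true
```

The same thing happens to embedded English words, dates, and URLs, which is why a proper bidi implementation treats those runs separately.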
Note: We are reversing array of words, not array of characters!!! |
@weera-tech the actual letters need to be reversed too. Not just the word order is supposed to be reversed in rtl writing. |
You are right, but as I said, you first install a suitable language package, and for the RTL direction you have to set align to right. That will therefore conflict with the TCLDR character ordering. Simple: -1 * -1 = 1 :) |
I'm not sure what sort of mechanism would actually reverse characters for you, but not words, considering pdfkit has no RTL support whatsoever. Perhaps something weird is happening on Linux. I'm using pdfkit in the browser with webpack. In my experience, and I have a production app using this approach with TwitterCLDR and pdfkit, simply reversing words resulted in support tickets being issued for exactly this problem. Words were in the correct order, but letters were in the wrong order. |
Ooops!!! |
The only correct implementation will be the Unicode bidi algorithm. Anything else, especially reverse(), will be incorrect. |
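To show why a single `reverse()` cannot be right, here is a sketch of just the final reordering step of the Unicode Bidi Algorithm (rule L2 in UAX #9). It assumes the embedding levels have already been resolved by the earlier rules, which is the hard part and is not shown; `reorderByLevels` is a hypothetical name.

```javascript
// Rule L2 only: from the highest level down to the lowest odd level,
// reverse every contiguous run of characters at that level or higher.
// chars: characters in logical order; levels: resolved level per character.
function reorderByLevels(chars, levels) {
  const oddLevels = levels.filter((level) => level % 2 === 1);
  if (oddLevels.length === 0) return chars.slice(); // pure LTR: nothing to do

  const items = chars.map((ch, i) => ({ ch, level: levels[i] }));
  const highest = Math.max(...levels);
  const lowestOdd = Math.min(...oddLevels);

  for (let level = highest; level >= lowestOdd; level--) {
    let i = 0;
    while (i < items.length) {
      if (items[i].level < level) {
        i++;
        continue;
      }
      let j = i;
      while (j < items.length && items[j].level >= level) j++;
      // Reverse items[i..j) in place.
      const run = items.slice(i, j).reverse();
      items.splice(i, run.length, ...run);
      i = j;
    }
  }
  return items.map((item) => item.ch); // visual order, left to right
}
```

For three Hebrew letters, a space, and the digits 123 in an RTL paragraph, the levels are `[1, 1, 1, 1, 2, 2, 2]`: the digit run is reversed once at level 2 and again as part of the whole line at level 1, so it comes out reading 123 while the Hebrew is flipped.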
There is a recent WASM build of the HarfBuzz engine which is a text shaping engine used by Firefox, Chrome, and others. https://github.com/harfbuzz/harfbuzzjs It does support Unicode bidi algorithms among other things. I believe it could be integrated with pdfkit to solve RTL once and for all. There is a demo here: https://harfbuzz.github.io/harfbuzzjs/ Some discussion about it being used to solve RTL issues for Photopea, which is a very popular online image editor: harfbuzz/harfbuzzjs#10 Unfortunately I'm not familiar at all with pdfkit's text rendering, but perhaps someone could look into it. |
Hey, Any news with RTL support? |
@devongovett from my limited understanding of fontkit it seems that it does indeed support rtl. I found this site and I was able to see rtl text being rendered properly. Also from what I understand, pdfkit is based on fontkit so what is stopping this from working? |
@andreialecu because RTL support is more than glyph rendering. The only proper way to render an RTL language is
|
I too would love to have RTL support (Hebrew). |
+1 for rtl support |
Think out of the box |
I was able to use a Persian font like this; I used this link
In my case I used a Persian font; you can use whichever font you need |
How is this still not supported? |
Wow, 7 years and still no full RTL support out of the box?… |
So I tried pretty much everything but nothing works.
but it gets rendered like this:
The "solution":
will handle Hebrew but not a combination of RTL and LTR (it's result with Any working suggestions? 🙏 |
How come this superior library isn't supporting RTL languages?!! |
For me, all the Arabic letters are parsed correctly with { rtl: true }; only the numbers come out in reverse direction. So I wrote a function and pass the string through it before handing it to PDFKit's text() function.
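The function from this comment did not survive, so the following is only a guess at the general idea, not the commenter's code. It assumes the renderer ends up reversing digit runs, as described, and pre-reverses them so the two reversals cancel out; `preReverseNumbers` is a hypothetical name.

```javascript
// Hypothetical workaround: reverse each run of digits (including . or , separators
// between digits) so that a renderer which flips them shows the original number.
function preReverseNumbers(text) {
  return text.replace(/\d+(?:[.,]\d+)*/g, (run) => Array.from(run).reverse().join(""));
}
```

This only masks one symptom. Embedded Latin words, dates, and punctuation are not handled, and the result will be wrong if the renderer does not reverse digits.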
|
@AmirABody Kinda wondering, why would you say this is a superior library, then? |
It requires a higher level layout algorithm than what pdfkit offers, for example https://github.com/foliojs/textkit. React PDF uses it under the hood: https://github.com/diegomura/react-pdf. Not sure if it supports bidi yet but the architecture is there to support it. Personally I think pdfkit is too low level for advanced text layout, and that it belongs in a higher level library like React PDF or pdfmake, but I also don't work on pdfkit much anymore. |
still an issue 9 years later. |
To my understanding there are 2 challenges:
Regarding point 1, which was discussed above, please test something like var doc = new PDFDocument({}) Additionally, you can mix Arabic and non-Arabic text and it should render correctly, or am I wrong? |
Please add Right-to-Left (RTL) support for languages like Hebrew and Arabic...
Something like: