Unicode -- Uphill both ways (Ruby Programming pt. 7)

I found this cool article on Unicode (it gets UTF-16 wrong but that's ok). However, I'm running into a large wall dealing with Unicode in my program, so I'll put it out there in the hope that a solution presents itself.

So far my program checks each line of the file to see if it's ASCII-only text. If so, it reverses the line with Ruby's built-in reverse method.
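
Here is a minimal sketch of that check, not the actual program. The helper name `reverse_if_ascii` is made up for illustration, and it assumes a Ruby version that has `String#ascii_only?` (1.9 or later).

```ruby
# Hypothetical helper: reverse a line only when it is pure ASCII.
# Returns nil for lines that contain anything outside ASCII.
def reverse_if_ascii(line)
  text = line.chomp                  # set the trailing "\n" aside
  return nil unless text.ascii_only?
  text.reverse + "\n"                # put the end of line back after reversing
end

reverse_if_ascii("hello\n")   # => "olleh\n"
```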

If not, what I want is to have it read each hex pair (or foursome) and decide how to treat it. If the value is at or below U+007F, treat it as plain ASCII and pass the character as one element to an array. If it's between U+0080 and U+FFFF, take a two-byte chunk and pass it as one element to the array. And finally, if it's between U+010000 and U+10FFFF, take a three-byte chunk and pass it as one element to the array. Then read the elements of the array first in, last out (FILO), remove the end-of-line (\n) marker, and put the elements into another array. Join that array, add an end-of-line element, and write it to the file.
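
The chunk-and-reverse plan above could be sketched roughly like this. It rests on a big assumption: that the file is saved as UTF-8. In UTF-8 the chunk sizes are not quite the ones in the plan: U+0080 to U+07FF take two bytes, U+0800 to U+FFFF take three, and U+10000 to U+10FFFF take four, and the first byte of each character tells you which. The helper names `utf8_width` and `reverse_line` are made up for illustration.

```ruby
# Sketch only: assumes the line is valid UTF-8.
# Hypothetical helper: how many bytes a character occupies, judged by its first byte.
def utf8_width(lead_byte)
  case lead_byte
  when 0x00..0x7F then 1   # U+0000..U+007F, plain ASCII
  when 0xC0..0xDF then 2   # U+0080..U+07FF
  when 0xE0..0xEF then 3   # U+0800..U+FFFF
  when 0xF0..0xF7 then 4   # U+10000..U+10FFFF
  else raise ArgumentError, format("unexpected lead byte 0x%02X", lead_byte)
  end
end

# Hypothetical helper: reverse a line one character (not one byte) at a time.
def reverse_line(line)
  bytes = line.chomp.bytes.to_a    # remove the end-of-line marker first
  chunks = []
  i = 0
  while i < bytes.length
    width = utf8_width(bytes[i])
    chunks << bytes[i, width]      # one character = one array element
    i += width
  end
  # Read the chunks last in, first out, rejoin them, and restore "\n".
  chunks.reverse.flatten.pack("C*").force_encoding("UTF-8") + "\n"
end

reverse_line("a\u00F1\n")   # => "\u00F1a\n"
```

On Ruby 1.9 or later, `String#chars` and `String#reverse` already work character by character when the string's encoding is UTF-8, so the manual loop is mostly useful for seeing what is going on underneath.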

So the first thing I need to do is find a way of reading the hexadecimal values of the characters. After a lot of looking I found a hex editor plugin for Notepad++, and though it doesn't do exactly what I want, I figured something out. The last ASCII character, U+007F, is 7F in the hex view of the file. Apparently Notepad++ hides the 00 of endian-ness. So that's the one I want to move as one element to another array. And at least for now I can assume that everything above 80 is a two-byte element, until I figure out a way of reading the three-byte ones. It won't be perfect, but if it works it will be a step.
Now to try it out.
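
The hex values can also be read from inside Ruby, without the hex editor. This is a small sketch with a made-up helper name, `hex_bytes`, and it assumes the text is UTF-8. In that encoding no 00 bytes are hidden for ASCII characters: each one really is a single byte, and every byte at 80 or above is part of a longer character.

```ruby
# Hypothetical helper: list each byte of a string as two hex digits.
def hex_bytes(text)
  text.bytes.map { |b| format("%02X", b) }
end

hex_bytes("A\u007F")   # => ["41", "7F"]
hex_bytes("\u00F1")    # => ["C3", "B1"]  (the letter n with tilde, two bytes in UTF-8)
```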
