Unicode -- Uphill both ways (Ruby Programming pt. 7)

I found this cool article on Unicode (it gets UTF-16 wrong, but that's OK). However, I'm running into a large wall dealing with Unicode in my program, so I'll put the problem out there in the hope that a solution presents itself.

So far my program checks each line of the file to see if it's ASCII-only text. If so, it reverses the line with Ruby's built-in reverse method.
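Here is a minimal sketch of that ASCII branch. It assumes Ruby 1.9 or later, where `String#ascii_only?` is available. The method name `reverse_if_ascii` is made up for illustration, and stripping the newline before reversing is my assumption about how the line should be handled.

```ruby
# Hypothetical helper: reverse a line only when every character is ASCII.
# Returns nil for lines that need the Unicode-aware path instead.
def reverse_if_ascii(line)
  return nil unless line.ascii_only?

  # Drop the trailing newline first so it stays at the end of the
  # reversed line instead of moving to the front.
  line.chomp.reverse + "\n"
end

puts reverse_if_ascii("hello\n")   # prints "olleh"
```

Returning nil for non-ASCII lines keeps the two cases separate, so the caller can hand those lines to whatever Unicode handling comes next.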

If not, what I want is for it to read each hex pair (or foursome) and decide how wide the character is. If the value is at or below U+007F, treat it as plain ASCII and pass the character as one element to an array. If it's between U+0080 and U+FFFF, take a two-byte chunk and pass it as one element to the array. And finally, if it's between U+010000 and U+10FFFF, take a three-byte chunk and pass it as one element to the array. Then read the elements of the array first in, last out (FILO), remove the end-of-line (\n) marker, and put the elements into another array. Join that array, add an end-of-line element, and write it to the file.
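The chunk-and-reverse idea can be sketched as follows, assuming the file is saved as UTF-8. Note that this departs from the byte counts in the plan above: in UTF-8 a character takes one byte up to U+007F, two bytes up to U+07FF, three bytes up to U+FFFF, and four bytes up to U+10FFFF, and the first byte of each character tells you which. The names `utf8_width` and `reverse_utf8_line` are hypothetical, and the code does no validation of malformed input.

```ruby
# Hypothetical helper: how many bytes a UTF-8 character occupies,
# judged from its first (lead) byte.
def utf8_width(lead_byte)
  if    lead_byte < 0x80 then 1  # 0xxxxxxx: U+0000..U+007F (ASCII)
  elsif lead_byte < 0xC0 then 1  # stray continuation byte; pass it through alone
  elsif lead_byte < 0xE0 then 2  # 110xxxxx: U+0080..U+07FF
  elsif lead_byte < 0xF0 then 3  # 1110xxxx: U+0800..U+FFFF
  else                        4  # 11110xxx: U+10000..U+10FFFF
  end
end

# Split a line into per-character byte chunks, reverse the chunks,
# and glue them back together with a single trailing newline.
def reverse_utf8_line(line)
  bytes  = line.chomp.bytes.to_a   # remove "\n", then get the raw byte values
  chunks = []
  i = 0
  while i < bytes.length
    width = utf8_width(bytes[i])
    chunks << bytes[i, width]      # one array element per character
    i += width
  end
  reversed = chunks.reverse.flatten.pack("C*").force_encoding("UTF-8")
  reversed + "\n"
end

puts reverse_utf8_line("ab\u00E9\n")   # prints "éba"
```

On Ruby 1.9 or later, `String#reverse` already works character by character on a UTF-8 string, so this manual version is mostly useful for seeing what happens at the byte level.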

So the first thing I need to do is find a way of reading the hexadecimal values of the characters. After a lot of looking I found a hex editor plugin for Notepad++, and though it doesn't do exactly what I want, I figured something out. The last ASCII character, U+007F, shows up as 7F in the hex view of the file. Apparently Notepad++ hides the 00 of endianness. So that's the one I want to move as one element to another array. And at least for now I can assume that everything from 80 up is a two-byte element, until I figure out a way of reading the three-byte ones. It won't be perfect, but if it works it will be a step.
Now to try it out.
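The hex values can also be read from inside Ruby rather than from a hex editor. This is a small sketch using `String#unpack` with the `"C*"` directive, which returns every byte as an unsigned integer. The helper name `hex_bytes` is made up, and the sample strings assume UTF-8 text.

```ruby
# Hypothetical helper: list the bytes of a string as two-digit hex values.
def hex_bytes(text)
  text.unpack("C*").map { |byte| format("%02X", byte) }
end

p hex_bytes("A")        # ["41"]
p hex_bytes("\u00E9")   # ["C3", "A9"]  (é takes two bytes in UTF-8)
p hex_bytes("\u20AC")   # ["E2", "82", "AC"]  (€ takes three bytes)
```

If the file really is UTF-8, an ASCII character is stored as a single byte with no 00 padding, which would fit what the hex editor shows for 7F.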


David Acevedo's Blog
