Basic tech stuff

Programming and Linux administration

Unix: select vs poll

Posted by Daniel Brahneborg on 2012 January 23

Oh well, this one was originally written in Swedish, for a change.

For as long as I can remember, I have used the select() function to wait for data on a socket. The code for using it looks roughly like this:

fd_set fdset;
struct timeval timeout;

FD_ZERO(&fdset);
FD_SET(fd, &fdset);

timeout.tv_sec = 60;
timeout.tv_usec = 0;

switch (select(fd + 1, &fdset, NULL, NULL, &timeout)) {
case -1: /* something went wrong, check errno */ break;
case 1:  /* read data */ break;
default: /* timeout */ break;
}

At a customer site we suddenly got nothing but timeouts. Despite a perfectly fine select() call with the right parameters, it never told us when there was data to read, or when it was OK to send back a response.

The problem turned out to be that fd_set thing. The man page for select() says the following:

The behavior of these macros is undefined if the fd argument is less than 0 or greater
than or equal to FD_SETSIZE, …

So what is the value of FD_SETSIZE? Well, on both Linux and Solaris it is 1024. In the log files we examined, fd was between 1025 and 1028, since the program had quite a lot of active connections going. Oops.

So select() was useless for us, and we had to switch to the poll() function, which works a bit differently. Instead of taking a bitmap where you have marked which files you want to wait on, you pass in an array of file descriptors. The descriptor number becomes irrelevant and can be as large as you like, and the call takes no longer however large it is. Poor select() has to walk through the whole bitmap to find which bits you have set, which takes a little time.

This is, by the way, exactly the kind of change of perspective to get a better big-O value that I wrote about in last autumn's NaNoWriMo, for the 2-3 or so of you who have read it. 🙂

The code becomes a bit different, even though the principle is the same.

struct pollfd pfd; /* declared in <poll.h> */
int rc;

pfd.fd = fd;
pfd.events = POLLIN;
rc = poll(&pfd, 1, 60 * 1000); /* timeout is in milliseconds */
if ((rc < 0) || (pfd.revents & (POLLERR | POLLNVAL | POLLHUP))) { } /* fail */
else if (pfd.revents & POLLIN) { } /* read data */
else { } /* timeout */

Refactored into an “is there data to read on socket ‘fd’ within x seconds?” function, the new code became really neat. Previously there were a handful of different calls to select(), all of which handled errors ever so slightly differently. Moving it all into a separate function makes it easy to adjust the parameters to poll() should that be needed, makes it easy to log every single call, and ensures that the same kind of error is always reported upward in exactly the same way.

Posted in programming

JNI

Posted by Daniel Brahneborg on 2008 January 30

Yay me!

By using the example from the JNI documentation, fixing the errors, and then figuring out the not entirely obvious command lines for compiling, linking and running it, I now have a C program that can talk to Java.

On my Fedora Core 3 box I had to build and run it like this:

gcc -o runjava runjava.c -L/usr/java/jdk1.5.0_11/jre/lib/i386/client -ljvm
LD_LIBRARY_PATH=/usr/java/jdk1.5.0_11/jre/lib/i386/client ./runjava

I bet there is some nice shortcut for that, like `javaconfig -cflags` or something, but I haven’t found that one yet.

Using JNI doesn’t seem too difficult, especially since we’re only going to use very few arguments, and the same method signature for all calls. I hope our customers will like that they don’t have to write plugins in C anymore, but can use nicer languages.

Posted in programming

Never fetch data from views

Posted by Daniel Brahneborg on 2007 August 22

Jeez, when am I going to learn?

I know very well that all data that is displayed in a view must be fetched by the controller.

Views must never, ever, under any circumstances, fetch data by themselves.

Still, I manage to break that rule all too often. “It’s just going to fetch THIS little piece of data”. This includes not only SQL statements, but also all kinds of accessors that do more than fetch complete objects from a hashtable or similar.

A while ago I wrote a little find_safely() helper method for my ApplicationController that handles “Record not found” errors. All find() accesses went through that method. Later the database had to be separated into different systems, so most tables got a system_id column, and thanks to the helper only that single method had to be changed. Yes, the value had to be set in a few places, but it never fetched data for the wrong system.
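The idea of a single choke point for lookups can be sketched in plain Ruby. This is not the original helper and it does not use ActiveRecord: the Customer class, its ROWS data, the RecordNotFound error, and the find_safely signature are all hypothetical stand-ins chosen for illustration.

```ruby
# Hypothetical stand-in for ActiveRecord's "record not found" error.
class RecordNotFound < StandardError; end

# Hypothetical in-memory model; a real application would query the database.
class Customer
  attr_reader :id, :system_id, :name

  def initialize(id, system_id, name)
    @id = id
    @system_id = system_id
    @name = name
  end

  ROWS = [
    new(1, 10, "Alice"),
    new(2, 20, "Bob")
  ]

  # Mimics a finder that raises when nothing matches.
  def self.find(id, system_id)
    row = ROWS.find { |r| r.id == id && r.system_id == system_id }
    raise RecordNotFound, "Customer #{id} not found in system #{system_id}" if row.nil?
    row
  end
end

# The single place where lookups happen: it scopes every lookup to one
# system and turns "not found" into nil instead of an exception.
def find_safely(model, id, system_id)
  model.find(id, system_id)
rescue RecordNotFound
  nil
end

found   = find_safely(Customer, 1, 10)  # Alice, who belongs to system 10
missing = find_safely(Customer, 1, 20)  # nil, since id 1 is not in system 20
puts found.name
puts missing.inspect
```

Because every caller goes through find_safely, adding the system_id condition is a change in one method rather than in every controller action.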

Unfortunately some views fetched data themselves, which meant that I had to have the system_id check in both ApplicationController and ApplicationHelper. This quickly got out of hand, so it didn’t take long until I moved all those accesses back to the proper controller. When this was done, it was also possible to test that the views were showing the correct data. The normal Rails test harness doesn’t have access to the data that the views fetch, which means it can’t be tested. After the refactoring, it could.

This brings back another of my favourite arguments:

You should write tests, not only to verify that the application does what it is supposed to (you’ll find that out soon enough anyway), but because it forces you to build the application in a better way.

Anyway, the “data from views” problem came back to bite me once again today. The application stores a list of the most recently accessed business objects of each type. This was tested by directly checking the contents of the array where they were stored.

It worked fine, until the objects grew too large, overflowing the “data” column in the sessions table, which in turn caused a “marshal data too short” error to appear in the Mongrel log file. Most pages I found recommended using a larger column type such as mediumtext, but this is of course a bad idea. Having the poor application read and parse several hundred KB or more on each request, only to serialize it back and update the database afterwards, kills performance. A couple of KB is ok, but the 64KB limit of the “text” type is there for a reason.

The proper solution is of course to store only the object IDs, and instantiate the objects only when needed. The application worked again, but now the hard-coded test cases started to break. To get them to work, the controllers had to fetch this data, which in turn meant that the views didn’t have to.
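As a rough illustration of why storing only the IDs helps, here is a plain-Ruby sketch. The Document struct, the DOCUMENTS hash standing in for the database, the MAX_RECENT limit, and both helper methods are assumptions made for this example and are not the application's real code.

```ruby
# Hypothetical business object with a large payload.
Document = Struct.new(:id, :body)

# Pretend database keyed by id (assumption: lookups by id are cheap).
DOCUMENTS = {}
(1..20).each { |i| DOCUMENTS[i] = Document.new(i, "x" * 10_000) }

MAX_RECENT = 5

# Record an access by storing only the id: newest first, no duplicates,
# and never more than MAX_RECENT entries.
def remember_recent(session, type, id)
  ids = session[type] || []
  session[type] = ([id] + (ids - [id])).first(MAX_RECENT)
end

# Instantiate the objects only when a page actually needs them.
def recent_objects(session, type, store)
  (session[type] || []).map { |id| store[id] }.compact
end

session = {}
[3, 7, 3, 12].each { |id| remember_recent(session, :documents, id) }

# Compare what would have to be serialized into the session in each case.
ids_size     = Marshal.dump(session).bytesize
objects_size = Marshal.dump(recent_objects(session, :documents, DOCUMENTS)).bytesize
puts "session with ids: #{ids_size} bytes"
puts "same list as full objects: #{objects_size} bytes"
```

The serialized session stays at a few dozen bytes no matter how large the objects themselves grow, so the 64KB column limit is no longer a concern.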

Instead of an ugly Module that was imported by both ApplicationController and ApplicationHelper, the “latest objects” code could be merged back into ApplicationController where it belongs. The code became simpler, the response time decreased, the database errors disappeared, and a larger part of the code got proper unit tests. Sure, just using “mediumtext” would have been faster in terms of work hours, but the code quality would have suffered.

Posted in rails

Basic’s programming laws

Posted by Daniel Brahneborg on 2007 August 1

I’ve come to realize that the following laws apply when debugging a program.

  1. A program that runs on several platforms will fail on the one with the worst tools for debugging.
    1. Having gdb installed on one machine but not on another makes the bug appear on the second machine.
    2. Having Purify or Valgrind installed on a subset of the platforms guarantees that the bugs will occur only on the ones without these tools.
  2. With identical toolsets, the bugs will occur on the machine that is hardest to access.
    1. With ssh access to two machines, the bug will occur on the one to which you can’t use key-based login.
    2. If one machine requires Cisco VPN login, which never seems to work correctly, that’s the machine which will fail.
  3. If the program is multithreaded, the bug will only appear when running outside of the debugger, since these tools always affect the relative timing of the threads.

On the other hand, the reason I’m a programmer is not because it’s easy, but because it’s hard.

Posted in programming

Rails fix for nested forms

Posted by Daniel Brahneborg on 2007 May 26

In Rails there is a nice helper function button_to. It uses JavaScript in the onclick event to create a new form which is immediately submitted. Perfect when a button is all you really need.

Or so I thought, until I installed the HTML Validator extension to Firefox. It has a super strict HTML parser that tells you when something is wrong with your code. This way there is a much greater chance that the pages will look the way you want in as many browsers as possible.

So what’s the problem? Well, I had a form surrounding a table. The form was really only for the last row in the table, but since a form must either be outside of the table tag or within a td, the only option was to let the form surround the table. On all rows but the one with the real form fields, I wanted that new button. When the new form was created, the HTML validator became a bit upset about the fact that a new form was created within an existing form. This isn’t allowed. (There are sure a lot of things that aren’t allowed in HTML.)

The solution was to patch actionpack/lib/action_view/helpers/url_helper.rb around line 367, changing

    "this.body.appendChild(f); f.method = 'POST'; f.action = this.href;"

into

    "document.body.appendChild(f); f.method = 'POST'; f.action = this.href;"

This makes the new form live as a child of the top-level body tag instead. Now the validator was happy again.

Posted in programming, rails

Reasons for automatic tests

Posted by Daniel Brahneborg on 2007 May 22

As everybody should know by now, there are plenty of reasons for writing automatic tests. One of them, of course, being the fact that you get to see whether your application actually works, at least for some cases. That’s just a small part of the story, though.

Reason number two is design. To be able to write both small unit tests, large system tests and everything in between, the application simply must be well designed. Otherwise it will be impossible to test one thing at a time, and mock out the parts that should be faked. Knowing where to draw the lines between the modules in a system can be difficult, but by simply trying to write nice tests for them, it becomes much easier.

The third reason is refactoring, which I personally got bitten by this weekend when making a change to my RSS/Ping service, written in Ruby on Rails. In the first versions the “ping” logic was a separate program, but the users wanted it merged with the web application. So, I very carefully moved one function at a time into the classes where they belonged, and made a little button in the web interface. Despite being exactly the same code, with no uninitialized variables or anything like that, it simply refused to work.

Today I found the problem. The standalone program used a couple of require statements to import modules for RSS and Atom parsing and whatnot, things that were never loaded in the web application. Without them the function failed, even though the code itself was technically fully correct. The standalone program worked, so tests weren’t really necessary, I thought.

Lesson learned? Automatic tests are good, and should aim for full code coverage. This makes the programming part so much easier.

Posted in programming, rails, testing

Rails: belongs_to :polymorphic and inheritance

Posted by Daniel Brahneborg on 2007 May 10

The :polymorphic option for belongs_to associations is extremely useful, especially when implementing an authorization system. Let’s say that I have a Permission class that points to either a House or a Car:

class Permission < ActiveRecord::Base; belongs_to :authobject, :polymorphic => true; end
class House < ActiveRecord::Base; has_many :permissions, :as => 'authobject'; end
class Car < ActiveRecord::Base; has_many :permissions, :as => 'authobject'; end

This works just fine, making it possible both to find the object that a Permission is for, and the relevant Permissions for a certain object. In the column authobject_type you get the strings “House” and “Car”, respectively.

Now the problem: We want to replace the Car class with an abstract Vehicle class, with subclasses Car and Boat:

class Vehicle < ActiveRecord::Base; has_many :permissions, :as => 'authobject'; end
class Car < Vehicle; end
class Boat < Vehicle; end

In Rails version 1.2.3 this puts the string “Vehicle” in the Permissions.authobject_type column, which causes lots of stuff to fail. In my case, a bunch of test cases that simply created a Permission and made sure it could be found again. Suddenly it couldn’t.

The problem was this bug: http://dev.rubyonrails.org/ticket/6485, with a patch that makes sure that the real class name is stored in the authobject_type field. The Rails documentation says you should store the base class in the authobject_type field, but that simply isn’t right, since it makes it impossible to load the right subclass. Since we still want to store permissions to Houses, it’s important that we store the exact class name.

Edit 2007-05-16: The problem is still not completely solved, since :dependent => :destroy still uses the base class name instead of the correct one. Since following the relationship works, you have to loop and destroy the objects manually.

Posted in programming, rails

This is NOT the HD-DVD key

Posted by Daniel Brahneborg on 2007 May 2

Just as normal DVDs are encrypted, so are HD-DVDs. In the latter case both a global key and a bunch of player-specific keys are used. First the player-specific keys were found for a particular player, and a while ago the global key was found. Using this, and some software, any HD-DVD can be played and copied to a hard drive for later viewing.

Today Slashdot has an article about the censoring of this number. Writing it seems illegal, but I ought to be able to write that it is NOT f6 06 ee fd 62 8b 1c a4 27 be a9 3a 9c a9 77 3f. (Thanks to AJVM for the tip.)

Damn, the movie industry is stupid.

Posted in encryption

The Ruby module Enumerable is fun

Posted by Daniel Brahneborg on 2007 April 21

In a Ruby application I had a bunch of data in a hash table that I wanted to print in a special format. There were a couple of requirements:

  1. Some of the options should be ignored.
  2. The format should be “key:value”.
  3. The list should be sorted.
  4. The entries should be separated by a space.

Fortunately this sort of thing is dead easy in Ruby, because of all of the cool functions in the Enumerable module. By stacking them after each other, I got this:

KILL_LIST = [ 1, 42, 312 ]
def hash_for_print(hash_data)
  hash_data.
    reject {|key, value| KILL_LIST.include?(key)}.
    collect {|key,value| "%03d:%s" % [key, value]}.
    sort.
    join(" ")
end

No, it’s not something I’d use in the innermost loop of a realtime system. That’s irrelevant. The code is dead easy to understand and modify, and didn’t take much time to write. Besides, it scales almost linearly (only the sort step is O(n log n)), which is quite important.
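To see the chain in action, here is a small usage sketch. The function is repeated so the snippet runs on its own; the sample hash and its values are made up for illustration.

```ruby
KILL_LIST = [1, 42, 312]

def hash_for_print(hash_data)
  hash_data.
    reject { |key, value| KILL_LIST.include?(key) }.  # requirement 1: ignore some keys
    collect { |key, value| "%03d:%s" % [key, value] }. # requirement 2: "key:value"
    sort.                                              # requirement 3: sorted
    join(" ")                                          # requirement 4: space separated
end

# Made-up sample data, deliberately out of order and containing two ignored keys.
sample = { 100 => "bar", 42 => "skip", 7 => "foo", 1 => "x" }
result = hash_for_print(sample)
puts result  # prints "007:foo 100:bar"
```

The zero-padded "%03d" format is what makes the plain string sort agree with numeric order for keys below 1000.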

Posted in programming, ruby

32 bit IP address to dotted notation in Ruby

Posted by Daniel Brahneborg on 2007 April 13

In one of our applications we store an IP address as a 32 bit integer. To show the value of this field it must be converted to normal dotted notation, and then back again to an integer to get stored in the database.

Going from dotted notation to an integer is easy:

require 'ipaddr'
IPAddr.new('1.2.3.4').to_i

Or, the “manual” version:

'1.2.3.4'.split('.').inject(0) {|total,value| (total << 8) + value.to_i}

I couldn’t find any examples of going from an integer to dotted notation, so I ended up with this:

address = 0x01020304
[24, 16, 8, 0].collect {|b| (address >> b) & 255}.join('.')
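For comparison, here are two other ways to go from an integer to dotted notation, sketched against current Ruby; I have not checked how far back in Ruby's history they work. The first uses IPAddr with an explicit address family, the second uses pack and unpack. The variable names are just for illustration.

```ruby
require 'ipaddr'
require 'socket'

address = 0x01020304

# With an Integer argument, IPAddr needs the address family spelled out.
dotted = IPAddr.new(address, Socket::AF_INET).to_s

# Without IPAddr: pack as a 32-bit big-endian integer, then read back
# the four bytes as unsigned 8-bit values.
packed = [address].pack('N').unpack('C4').join('.')

puts dotted
puts packed

# Round trip back to the integer that gets stored in the database.
round_trip = IPAddr.new(dotted).to_i
puts round_trip == address
```

Both variants give the same string as the shift-and-mask version above for any value that fits in 32 bits.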

Posted in programming, ruby