I am currently in the middle of writing a Linq to Amazon SimpleDB provider, since I have yet to really dive deep into Linq and I wanted to figure out exactly how it all works. I have already made some decent headway on the query part of it (I have yet to deal with inserts, updates, or deletes), but I am running into some issues with the way that Amazon SimpleDB handles numbers. According to Amazon, all numbers must be zero padded to the length of the largest number that you will use. So, if I wanted to cover the full range of the Decimal type, I would need to pad to 29 digits!
The issue here is that Amazon treats all data as text and compares it character by character, so in order to get proper comparisons we have to pad everything out to the same length. If I compare ‘6’ and ‘12’, then ‘6’ will come out as larger, since the first character ‘6’ is larger than the first character ‘1’. But if I pad ‘6’ out to ‘06’, then ‘12’ will come out larger, since we now compare ‘0’ to ‘1’ and ‘12’ wins out.
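To make the comparison problem concrete, here is a minimal sketch of the padding idea. It is written in Python purely for brevity (the same logic ports directly to C#), and the helper name `pad_number` is my own invention, not anything from the SimpleDB API.

```python
def pad_number(value: int, width: int) -> str:
    """Zero-pad a non-negative integer to a fixed width.

    Hypothetical helper: with every value padded to the same width,
    plain string comparison gives the same order as numeric comparison.
    """
    if value < 0:
        raise ValueError("negative values need an offset before padding")
    text = str(value).zfill(width)
    if len(text) > width:
        raise ValueError(f"{value} does not fit in {width} digits")
    return text


# Unpadded strings compare character by character, so '6' beats '12'.
print("6" > "12")

# Padded to the same width, the string order matches the numeric order.
print(pad_number(6, 2), pad_number(12, 2), pad_number(6, 2) < pad_number(12, 2))
```

The width has to be chosen up front and used for every value of that attribute, which is exactly where the trouble starts when you don't know the range in advance.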
On top of that, you also shouldn’t store negative numbers directly in Amazon SimpleDB, since the minus sign will also keep you from doing proper comparisons. So, if you are going to use negative numbers, you need to add an offset equal to the absolute value of the most negative number that you are going to use. If the smallest number I am going to use is ‘-500’, then I add ‘500’ to every value, so ‘-500’ becomes ‘0’ and ‘0’ becomes ‘500’. This way all of our comparisons in the database still come out correct, and when we pull the data out of the db we just subtract the offset from the stored value and we end up with our proper number. A bit of a pain if you ask me, but still doable nonetheless.
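The offset trick from the ‘-500’ example can be sketched like this. Again this is Python for brevity, and `OFFSET`, `WIDTH`, `encode`, and `decode` are hypothetical names I made up for illustration; the width of 4 assumes that stored values never exceed 9999 after the offset is added.

```python
OFFSET = 500   # assumption: -500 is the smallest value we will ever store
WIDTH = 4      # assumption: value + OFFSET never needs more than 4 digits


def encode(value: int, offset: int = OFFSET, width: int = WIDTH) -> str:
    """Shift the value so it is never negative, then zero-pad it."""
    shifted = value + offset
    if shifted < 0:
        raise ValueError(f"{value} is below the supported minimum of {-offset}")
    text = str(shifted).zfill(width)
    if len(text) > width:
        raise ValueError(f"{value} does not fit in {width} digits after the offset")
    return text


def decode(text: str, offset: int = OFFSET) -> int:
    """Reverse the encoding: parse the digits and subtract the offset."""
    return int(text) - offset


values = [12, -500, 0, -6, 6]
stored = sorted(encode(v) for v in values)   # sorted as plain strings
print(stored)
print([decode(s) for s in stored])           # comes back in numeric order
```

Note that the offset and the width become part of the stored format: change either one later and every existing value has to be rewritten.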
The problem I am running into is that I don’t know how large or small any of these fields are going to be. Each domain (basically a table) in SimpleDB can hold any number of attributes, and different items can have different attributes, so the provider has no way of knowing which attributes exist in a domain. If we don’t know which attributes exist, then we can’t really ask the user for a range for each of them. Well, we can ask for the ranges, but we have no real way to check them. They would have to be configured in some way for each application, and if you got these values wrong, you could really shoot yourself in the foot later on when you needed a larger (or smaller) number.
So, what would you do? I am leaning toward going the simple route and just making every number in the db large enough to hold the maximum and minimum Decimal type value. I don’t really know what the performance implications will be on the Amazon side, but I will see what I can find out. Most likely I will just need to run some tests and see if I can get some numbers on this.
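A rough sketch of what the "just cover the whole Decimal range" option might look like is below. It is in Python for brevity, handles whole numbers only (fractional digits are ignored here, which is a simplifying assumption on my part), and `encode_full_range` and `decode_full_range` are hypothetical names. One detail worth noticing: once the offset is added, the largest stored value is twice the Decimal maximum, so the padded width works out to 30 digits rather than 29.

```python
# System.Decimal.MaxValue in .NET is 2**96 - 1 (a 29-digit number).
DECIMAL_MAX = 2**96 - 1

# Assumption: shift by the full magnitude so the minimum value maps to 0.
OFFSET = DECIMAL_MAX

# The largest shifted value is 2 * DECIMAL_MAX, which has 30 digits.
WIDTH = len(str(2 * DECIMAL_MAX))


def encode_full_range(value: int) -> str:
    """Offset and zero-pad any whole number within the Decimal range."""
    if not -DECIMAL_MAX <= value <= DECIMAL_MAX:
        raise ValueError("value is outside the Decimal range")
    return str(value + OFFSET).zfill(WIDTH)


def decode_full_range(text: str) -> int:
    """Reverse the encoding by subtracting the offset."""
    return int(text) - OFFSET


print(WIDTH)
print(encode_full_range(-500))
print(encode_full_range(0))
print(encode_full_range(12))
```

The upside is that nothing needs to be configured per attribute; the cost is 30 characters for every number stored, no matter how small it is.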
Also, SimpleDB is still in limited beta, and so I am writing the provider using the query specs provided by Amazon. I also have access to the WSDL for the web service so I can see what the calls are going to look like. I do not have an account to test any of this though, so if anyone out there knows someone who can get me hooked up with an account on SimpleDB please let me know!