WTF SQL

Noodles

Queen of Ramen
Joined
Sep 20, 2018
Messages
458
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
So, this is mostly just venting a bit, involving work.

So, we have this website we use at work for On Call. It's being shut down, so we had a group build us a new, better website for On Call. That site works fine.

I work on our dashboard website for all of the sites across the country in my organization. We have a page that consolidated the On Call. I have worked off and on the last few weeks to rebuild this using the new On Call site.

My new set up, works fine.

We have like 200 locations in this system. They work fine on the development server.

Last night, we rolled the new system out to production. Went through, checked through every location. Everything works fine.

Except like 3 sites.

And after we spent hours putting with it, we absolutely could not figure out why. Thankfully, they are sort of minor sites, so we left them broken for now and it shouldn't be an issue.

So the system basically works as follows, from a coding flow.

A JavaScript function gets a variable for the location identifier, it's either an index number from the database or an alphabet based company code.

The page checks to see if it's numeric or alpha characters, if it's numeric, it loops up the alpha code from the database and returns it. If it's already alpha characters, it truncates it.

The system uses the alpha character code to pull the On Call data from the other website, then formats and spits out the JSON.

It works absolutely fine.

Except for like 3 sites.

I even replicated the page on a dummy page. If I card code the Alpha characters into the API call. It works.

If I run the SQL command in SQL Manager, it works.

If I go into the database and manually change the Alpha characters to something else, it works, though it returns "No Data" as expected instead of an error. If I replicate the Alpha code from another location, it pulls that location data and works.

If I card code in an if statement on the test page, that converts the numerical code to the alpha code, bypassing the SQL call, it works on the test page, but not the live page.

If I rebuild that location, sonit gets a fresh numerical index, it fails.

If I build a new location with fake data, or works, if I edit that location to match the bad location, it fails.

No matter what trouble shooting I tried, it always fails, when any numerical code tries to convert to this one of three particular alpha codes. We have 200 other sites that all flow and convert fine. There isn't anything special about the alpha codes either, they are just abbreviations for city and state for the most part.

Like I said, mostly, I am just venting about this bizzare as hell problem.
 

Cristiano

I AM BABY GROOT
Admin
Joined
Sep 19, 2018
Messages
2,031
SL Rez
2002
Joined SLU
Nov 2003
SLU Posts
35836
As someone who works extensively with JSON, I can tell you that while not as annoying as XML, it has its quirks too and is sensitive to structure that can easily be broken. Have you taken a look at the JSON being generated and tried it in a JSON viewer to make sure it's valid? Certain characters can break JSON. This site is excellent - you can paste in the JSON and switch to a JSON view to see it hierarchically and it will warn if it's not valid JSON.

 
  • 1Like
  • 1Thanks
Reactions: Noodles and Clara D.

Noodles

Queen of Ramen
Joined
Sep 20, 2018
Messages
458
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
Yeah that's something I have learned recently in learning how to worth with it. It can definitely be flakey and easily broken.
 

Free

Kamilah is stalking me
VVO Supporter 🍦🎈👾❤
Joined
Sep 22, 2018
Messages
4,793
Location
Underground in America
SL Rez
2008
Joined SLU
May 2009
SLU Posts
55565
No matter what trouble shooting I tried, it always fails, when any numerical code tries to convert to this one of three particular alpha codes. We have 200 other sites that all flow and convert fine. There isn't anything special about the alpha codes either, they are just abbreviations for city and state for the most part.
How is that code cast in the db? Is it an INT? Is the datatype the same on the incoming value from these "3 sites"?
 

Noodles

Queen of Ramen
Joined
Sep 20, 2018
Messages
458
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
How is that code cast in the db? Is it an INT? Is the datatype the same on the incoming value from these "3 sites"?
The base website is Cold Fusion so they don't really have INT or STRING types, the DB call uses a numerical value though, pulled from either the URL or passed by a bit of java that make a little pop up on a different page. It works in the other 100+ cases for the other sites the same way.
 

GoblinCampFollower

Well-known member
Joined
Sep 20, 2018
Messages
382
The base website is Cold Fusion so they don't really have INT or STRING types, the DB call uses a numerical value though, pulled from either the URL or passed by a bit of java that make a little pop up on a different page. It works in the other 100+ cases for the other sites the same way.
It sounds like somewhere, deep within that code there is a field that a user could enter as a string, but is actually expected to be something like a number or a date, so you have at least once incidence where a user entered garbage.

What might be related to this, is that I manage a process with thousands of lines of SQL code, and somewhere deep within it, I had something fail in dev because a bad piece of dev data had an invalid date or a character in what was supposed to be a number field. Taking it apart to try to find what the field was could take a LONG time. I just kind of prayed to Wotan that, that issue rarely occurs in dev and never never makes it over to prod, lol.
 

Soen Eber

Vatican mole
VVO Supporter 🍦🎈👾❤
Joined
Sep 20, 2018
Messages
379
Could it be expecting Latin-1 and getting a Unicode character?

I'm also suspecting data issues that should have tighter validation checks.
 

Noodles

Queen of Ramen
Joined
Sep 20, 2018
Messages
458
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
I figured it out.

It wasn't a problem in the SQL pull.

So after the JSON is pulled, it gets two pieces of data. A notes field, and the company ID number of the person on call.

It uses the ID to pull information about the employee from a company DB, name, phone numbers, etc.

One guy. ONE PERSON didn't have a cell phone listed on file. And this person was on call for all three of these sites. So the returned value was blank. Not a real problem except that for readability, the output inserts some dashes into the phone numbers at certain points. Because this person didn't have a number, the code just craps out when trying to put a - at location 3 and 6 in the phone number variable.

So I threw in some checks on those lines and everything works fine now.
 

Soen Eber

Vatican mole
VVO Supporter 🍦🎈👾❤
Joined
Sep 20, 2018
Messages
379
When I was working as a report writer I usually tried to include a data validation block right after a read. I was working with HR information which is ... notoriously sloppy.
 

Kara Spengler

Queer OccupyE9 Sluni-Goon
Joined
Sep 20, 2018
Messages
2,659
Location
SL: November RL: DC
SL Rez
2007
Joined SLU
December, 2008
SLU Posts
23289
I am constantly finding strange things in databases and tables that someone else set up which will lead to strange errors at some point. Always using floats .... including for integer math. Using characters that are populated with numeric data (which they then proceed to use math on). Assuming null and 0 and blank are all the same. Differences in how development and production servers are set up, a common one being case insensitive/sensitive. Then optimistic me suggests fixing these things and winds up finding why nobody tilted at these windmills with the user community before.
 
  • 1Hug
Reactions: Brenda Archer

Noodles

Queen of Ramen
Joined
Sep 20, 2018
Messages
458
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
I am constantly finding strange things in databases and tables that someone else set up which will lead to strange errors at some point. Always using floats .... including for integer math. Using characters that are populated with numeric data (which they then proceed to use math on). Assuming null and 0 and blank are all the same. Differences in how development and production servers are set up, a common one being case insensitive/sensitive. Then optimistic me suggests fixing these things and winds up finding why nobody tilted at these windmills with the user community before.
One legit SQL problem I had a while back was running out of rows. There is a system (built by someone else) that people use to upload and post files with. we didn't use it a ton but it was used pretty frequently. Well, but to a few years ago, we got this new piece of test gear, and part of using it, monthly, was posting the results to online file system. This is, roughly, 20 files per site, across ~60 sites. So around 1200 files per month.

Suddenly one day, files weren't posting. The index had been set to Small Int instead of Int, which tops out at something like 32,000 or so. It was an easy fix, but an unexpected problem.

Most of my other "problems" come from it being MS SQL, because 99% of the online code you fine references MySQL. The commands are usually pretty close but not always identical.
 
  • 1Thanks
Reactions: Brenda Archer

Bartholomew Gallacher

Well-known member
Joined
Sep 26, 2018
Messages
1,250
SL Rez
2002
If you might think that development at Oracle is different - time to think again!

Oracle Database 12.2.
It is close to 25 million lines of C code.
What an unimaginable horror! You can't change a single line of code in the product without breaking 1000s of existing tests. Generations of programmers have worked on that code under difficult deadlines and filled the code with all kinds of crap.
Very complex pieces of logic, memory management, context switching, etc. are all held together with thousands of flags. The whole code is ridden with mysterious macros that one cannot decipher without picking a notebook and expanding relevant pats of the macros by hand. It can take a day to two days to really understand what a macro does.
Sometimes one needs to understand the values and the effects of 20 different flag to predict how the code would behave in different situations. Sometimes 100s too! I am not exaggerating.
The only reason why this product is still surviving and still works is due to literally millions of tests!
Here is how the life of an Oracle Database developer is:
- Start working on a new bug.
- Spend two weeks trying to understand the 20 different flags that interact in mysterious ways to cause this bag.
- Add one more flag to handle the new special scenario. Add a few more lines of code that checks this flag and works around the problematic situation and avoids the bug.
- Submit the changes to a test farm consisting of about 100 to 200 servers that would compile the code, build a new Oracle DB, and run the millions of tests in a distributed fashion.
- Go home. Come the next day and work on something else. The tests can take 20 hours to 30 hours to complete.
- Go home. Come the next day and check your farm test results. On a good day, there would be about 100 failing tests. On a bad day, there would be about 1000 failing tests. Pick some of these tests randomly and try to understand what went wrong with your assumptions. Maybe there are some 10 more flags to consider to truly understand the nature of the bug.
- Add a few more flags in an attempt to fix the issue. Submit the changes again for testing. Wait another 20 to 30 hours.
- Rinse and repeat for another two weeks until you get the mysterious incantation of the combination of flags right.
- Finally one fine day you would succeed with 0 tests failing.
- Add a hundred more tests for your new change to ensure that the next developer who has the misfortune of touching this new piece of code never ends up breaking your fix.
- Submit the work for one final round of testing. Then submit it for review. The review itself may take another 2 weeks to 2 months. So now move on to the next bug to work on.
- After 2 weeks to 2 months, when everything is complete, the code would be finally merged into the main branch.
The above is a non-exaggerated description of the life of a programmer in Oracle fixing a bug. Now imagine what horror it is going to be to develop a new feature. It takes 6 months to a year (sometimes two years!) to develop a single small feature (say something like adding a new mode of authentication like support for AD authentication).
The fact that this product even works is nothing short of a miracle!
I don't work for Oracle anymore. Will never work for Oracle again!


 
  • 1Thanks
Reactions: Brenda Archer

Kara Spengler

Queer OccupyE9 Sluni-Goon
Joined
Sep 20, 2018
Messages
2,659
Location
SL: November RL: DC
SL Rez
2007
Joined SLU
December, 2008
SLU Posts
23289
One legit SQL problem I had a while back was running out of rows. There is a system (built by someone else) that people use to upload and post files with. we didn't use it a ton but it was used pretty frequently. Well, but to a few years ago, we got this new piece of test gear, and part of using it, monthly, was posting the results to online file system. This is, roughly, 20 files per site, across ~60 sites. So around 1200 files per month.

Suddenly one day, files weren't posting. The index had been set to Small Int instead of Int, which tops out at something like 32,000 or so. It was an easy fix, but an unexpected problem.

Most of my other "problems" come from it being MS SQL, because 99% of the online code you fine references MySQL. The commands are usually pretty close but not always identical.
Ugh, small ints should ONLY be used on things you know where you only need a few values, like where a boolean would do 99.999999999% of the time but you need a 3rd state in rare instances. If you are not willing to stake your life on ever needing more values use something bigger.

Try converting mysql to mssql. I just did a whole chunk of that and it is not fun. Of course this was dropped in as a 'one more thing' issue right at the end when discussing a conversion. Some code is just a routine change you find out eventually (oh, it is schema.schema.table not schema.table .... sometimes) but others require completely rewriting things.
 

Kara Spengler

Queer OccupyE9 Sluni-Goon
Joined
Sep 20, 2018
Messages
2,659
Location
SL: November RL: DC
SL Rez
2007
Joined SLU
December, 2008
SLU Posts
23289
It was not my idea. You can also tell nobody thought it was going to be a fun road to follow by it being saved for when people were getting up to leave the requirements meeting.