Welcome to Working With Rails

 

Discussion Forums

Discuss all things Ruby on Rails with perhaps the web's most vibrant group of Ruby on Rails enthusiasts.
Best way to structure a database for a large/static dataset
3 Posts
Best way to structure a database for a large/static dataset

Best way to structure a database for a large/static dataset

There can be a universal model, each record.

for e.g:- suppose u have a answer model,it greatly depends on the needs of application

@answers = Answer.find(...) @answers.text_answers //for text answers @answers.boolean_answers //for boolean answers

Or may be u can have like this:-

@answers = Answer.find(...) @answers.each do |a|
if a.textanswer?

 ...

elsif a.booleananswer?

 ...

end end

So it purely depends on the need and the semantics u would like to use in database.

Off the top of my head, I think your idea of having a big table with all the responses is on the right track. To reduce some of the redundancy of saying that person 123 took the survey in 1989 over and over for each response on it, you could have a table of, say, Takings, that would represent all the data about a given person having taken the survey. You may have many other such opportunities to extract things out of that table and into a much smaller one.

As for the variability of the answers, you can have a type column that will tell the app which of several supplementary tables to look in -- one for booleans, one for strings, one for dates, etc. Yes this introduces some complexity and inefficiency of operation, but at least the data will be reasonably normalized and efficient to store. Your Response class could have a method like "answer" that would check which table the actual answer is in, then go retrieve it.

Person
Forum Rating: 100
Sign In To Rate Post

I'm a rails newbie and am looking to create an app to work off a very large dataset. The dataset contains survey data from 1980 through 2011 for about 8,000 respondents. Each year's survey has between 900 and 7000 variable associated with it.

Basically, the app will be used to view all information for one respondent (seeing all the years at a glance), compare respondents, etc.

It seems like creating a table for each year isn't necessarily the best strategy. What would be a good way to structure this data for the database? I've considered just having a table that's basically the respondent ID, survey year, variable name, and value name. The difficulty there is that the values would range from simple boolean fields to fields with extensive amounts of text, so standardizing that field would be tricky (and probably very inefficient).

Any thoughts/tips/resources on where to turn?

Thanks in advance.

3 Posts
Login to add your message