Stop using unrealistic - or overly real - data in your internal databases!

Concept

For years, test data was either taken from real customers (difficult during initial development, and problematic due to privacy concerns otherwise) or obviously faked (Testy McTester 1235-b).  During demos or internal product use, having realistic data is absolutely necessary to see how screens feel, how reports look, and to get a proper sense for how usable your software actually is.  Bad test data hides a multitude of sins.  It can even interfere with performance testing: if it’s either more or less random than normal data, it can change the way that your database’s optimizer uses indexes.

Mockaroo provides (at this writing) 143 believable, random yet related fields that can be generated in batches of 1000 rows at a time (more if you pay a nominal fee) and used however you see fit (with the very reasonable exception that you can’t use them to feed into a competitive product).

Pros

Every field I’ve needed to fake up a dataset has been present.  They have all of the normal ones (people, places, businesses, phone numbers and addresses) and a lot of more obscure ones (VINs, credit card numbers, file names, et cetera).  Not only that, but when you ask for 15 fields they will generally make sense together (you won’t get an American street address with a Japanese country code, for example, even if they’re added as separate fields).

When presented on-screen or in reports the data “feels” correct.  Lengths vary naturally, and both testing and demos flow smoothly without having to worry about running afoul of privacy concerns, either legal or moral.

Cons

All of the email addresses generated by Mockaroo are believable, and many of them likely have real humans on the other end of them who won’t be overjoyed to get slammed by test emails.  Luckily there’s an easy fix to this - you can append “.test” to the end of all of them.  Since “.test” is a reserved top-level domain that never resolves to a real mail server, this keeps the appearance of useful data while ensuring that nobody ever gets hit.

This isn’t something that Mockaroo will do for you, mind you, but it’s a fairly trivial change to the data before (or during) your importation process.
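
As a rough illustration of that pre-import step, here is a minimal Python sketch that appends “.test” to every address in a CSV export.  The column name `email` and the helper names are my own assumptions for the example, not anything Mockaroo defines, so adjust them to match whatever you named the field in your schema.

```python
import csv
import io


def neutralize_emails(rows, column="email", suffix=".test"):
    """Yield copies of each row with the suffix appended to the email column.

    `column` is an assumed field name; empty values and values that
    already end in the suffix are left alone.
    """
    for row in rows:
        fixed = dict(row)
        value = fixed.get(column, "")
        if value and not value.endswith(suffix):
            fixed[column] = value + suffix
        yield fixed


def neutralize_csv(text, column="email"):
    """Rewrite CSV text so every address in `column` ends in .test."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(neutralize_emails(reader, column))
    return out.getvalue()
```

Running the downloaded file through something like `neutralize_csv` before loading it means the addresses still look plausible on screen, but the check for an existing suffix keeps the step safe to run twice.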

Depending on your use case you may be able to simply set your test environment to ignore email addresses that don’t match your internal domain, or similar, but I prefer to be extra-careful when it comes to potentially annoying innocent bystanders.
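
For the environment-level approach, a sketch of the idea might look like the following.  The domain `example.com` stands in for your internal domain, and both function names are hypothetical; where this check actually lives depends entirely on how your application sends mail.

```python
# Hypothetical allowlist; replace with your own internal domain(s).
ALLOWED_DOMAINS = {"example.com"}


def is_deliverable(address, allowed=ALLOWED_DOMAINS):
    """Return True only if the address belongs to an allowed domain."""
    _, sep, domain = address.rpartition("@")
    return bool(sep) and domain.lower() in allowed


def filter_recipients(addresses, allowed=ALLOWED_DOMAINS):
    """Drop every recipient outside the allowlist before sending."""
    return [a for a in addresses if is_deliverable(a, allowed)]
```

An allowlist fails safe: anything not explicitly permitted is dropped, so a forgotten or malformed address never leaves the test environment.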

Conclusion

Considering how easy it is to generate mock data in whatever format you want - not just CSV and tab-delimited, but more useful ones like custom JSON objects or SQL insert statements - there’s no reason not to use this service.  A token payment of $50 gives you a full year of faster access with the ability to generate up to 100,000 rows per click, too.  Even if you don’t need the extra capacity, it’s well worth paying to support such a useful tool.