Saturday, July 11, 2015

Deceit, the biggest of OrientDB problems

After my counter reply to the OrientDB CEO I wasn't expecting to write any posts soon, but the latest announcement from Orient Technologies compelled me to write again, so here we go.

On July 9, Orient Technologies announced Our Take on NoSQL DBMS Benchmarks and the New OrientDB Performance Challenge, which I invite you to read after you finish this blog post. In it, Orient accuses an OrientDB "competitor" of not following their suggestions and of publishing partial results. This is said without details, references, quotes or any evidence, leaving the reader to imagine who the competitor is and what the competitor has published. Unfortunately, the announcement doesn't tell the full story, and in this post I'll provide the context that led to it and expose Orient's deceitful marketing, perhaps the worst of many OrientDB problems.

Note: I usually provide links for every quote and screenshot I share, but given that OrientDB is reluctant to name their competitors in their announcement, I'll abstain from linking to the competitor's website, Hacker News threads, etc., so as not to reveal their identity.

Context: OrientDB Competitor's NoSQL benchmarks

June 4: OrientDB Competitor publishes NoSQL benchmark

Before the OrientDB benchmark was published, the OrientDB Competitor published an open-source benchmark comparing their multi-model database to reference GraphDB and DocumentDB products. This benchmark raised some interest and a few people asked for OrientDB to be included:
Rasmus: I'd like to challenge you to include OrientDB in that comparison - it uses a similar multi model, is ACID compliant, scales horizontally and vertically in a similar way, has SQL-like language and supports joins
codewithcheese: How does OrientDB compare?
crudbug: why OrientDB left out ? I would love to see the comparison.

June 11: Competitor adds OrientDB to NoSQL benchmark

Without first engaging Orient Technologies, the competitor added OrientDB to the benchmark using v2.0.9, obtaining the following results:

NoSQL benchmark chart #1
Figure 1 - competitor's NoSQL benchmark including OrientDB v2.0.9

June 19: OrientDB makes improvements to the open-source NoSQL benchmark

After the benchmark came out, Orient Technologies fixed some OrientDB problems and added optimizations (examples: 1111e89, f61d6f4, 194907b), released v2.1-RC4 and made a pull request from their branch to update the competitor's NoSQL benchmark.

June 25: OrientDB Competitor updates NoSQL benchmark with improvements

After the Orient Technologies changes, the competitor merged their pull request and updated the benchmark results to reflect the improvements, running it against OrientDB v2.1 RC4. This time the competitor also added absolute values in addition to relative results.

NoSQL benchmark table
Table 1 - revised competitor's benchmark results against OrientDB v2.1 RC4

NoSQL benchmark chart #2
Figure 2 - revised competitor's benchmark results against OrientDB v2.1 RC4

OrientDB's take on NoSQL benchmarks

July 9: NoSQL benchmarks graph by OrientDB

Two weeks after the competitor published the updated results, OrientDB reacted and published their own results from the same tests. Alongside their results they included the original (not the revised) competitor results for OrientDB:

Figure 3 - OrientDB team results
*Note [from OrientDB]: We could not reproduce the performance numbers shown by the competitor in the initial tests. Unfortunately, only percentages were provided, so we trusted their numbers and used them to derive the other values according to the performance difference.
Orient Technologies presents the competitor's results obtained against OrientDB v2.0.9 (Figure 1, orange bars) alongside the results the OrientDB team obtained against v2.1 RC4, v2.1 RC5 and v2.2 Alpha, showing very different numbers, obviously. Not only that, OrientDB split the neighbours results, which were merged in the original, and did the opposite for the single write tests. All this makes for a very confusing oranges-to-apples comparison.
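As a side note, here is a minimal sketch of why merged and split results can't be compared bar to bar. The numbers below are made up for illustration and do not come from either benchmark; the point is that a single merged relative value can hide very different per-test ratios.

```python
# Hypothetical timings in seconds; none of these figures are from the actual benchmark.
baseline = {"neighbours1": 10.0, "neighbours2": 30.0}  # reference database
orient   = {"neighbours1": 25.0, "neighbours2": 33.0}  # database under test

# A chart with one merged "neighbours" bar, relative to the baseline total:
merged_relative = sum(orient.values()) / sum(baseline.values()) * 100
print(f"merged neighbours: {merged_relative:.0f}%")  # one bar at 145%

# A chart with two separate bars tells a very different story:
for test in baseline:
    ratio = orient[test] / baseline[test] * 100
    print(f"{test}: {ratio:.0f}%")  # 250% for neighbours1, 110% for neighbours2
```

Neither presentation is wrong on its own, but placing a 145% merged bar from one source next to a 250% split bar from another makes the products look further apart than they actually are.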

Orient Technologies says they were unable to reproduce the competitor's results, but ironically the competitor's revised results (Figure 2, orange bars) are not far from OrientDB's own v2.1 RC4 results (Figure 3, yellow bars). Shortest path: 103% vs. 128%; single read: 223% vs. 361%; aggregation: 3892% vs. 3394%. In the first two cases the competitor's results even favor OrientDB.

A couple of users in the OrientDB Google Group noticed this and reacted:

Dário: it seems the results you published marked as "OrientDB test by competitor" are outdated as they've been corrected in 25th June. Your neighbours results don't seem to be comparable with theirs as, I believe, they are adding neighbours1 + neighbours2, while in your article they are separated. Anyway, if you use the newer article as source their OrientDB results seem very close to your OrientDB 2.1 RC4 results.
Luca (OrientDB CEO): The blog post shows how performance changed dramatically from the very first results when we weren't contacted to tune the OrientDB implementation.
I hope next time a vendor wants to create a new benchmark to showcase his product, he will contact the other vendors before publishing completely inaccurate results. While I understand that this is just marketing on their side, it's important to be fair and correct.
Scott: the second report results from competitor should have been used in the result set. It seems they are trying to be fair. The same should be returned. You refer to them reworking the benchmarks, yet by including the incredibly bad results of their first test and not the second set of results, you are sort of committing the same marketing whitewash you accuse them of doing. Not really a noble gesture.
No further reply from the OrientDB CEO.

Key takeaway: Orient Technologies shows the competitor's outdated results, gives the impression those results are not reproducible and compares them with results obtained with newer versions of OrientDB. All this without acknowledging that their competitor has updated their results, which incidentally are close to the ones OrientDB got.

July 9: Noteworthy OrientDB statements from the NoSQL benchmark announcement

Of course, it’s understandable that a vendor would never want to publish an update to a benchmark where a competing product’s performance improved by orders of magnitude, therefore providing free marketing for their competitor.
Their competitor did update the results (Figure 2), and in some tests OrientDB does come out ahead.

We won’t mention products by name here, as the goal is not to show that we are faster, but to highlight how things can dramatically change when the vendors are consulted. 
It's true, and impressive, how much OrientDB was able to improve its performance after being confronted with this benchmark. Unfortunately, OrientDB is not being transparent about the results their competitor obtained, nor open enough to let its users examine the competitor's results independently.

We’ve also used the new official Node.js driver.
This may suggest the new official Node.js driver performs better than the previous one, but if you read Why you should avoid OrientDB and peruse OrientJS's commit log you'll notice that, to date, no code changes have been made since it was forked from the original driver. Essentially, only the driver's and author's names were changed.

we’ve also made more optimizations here and there (all available in the latest versions)
This is true, and while it's commendable, it's a bit scary that Orient Technologies is prioritizing looking good in this benchmark over fixing OrientDB problems: v2.1 has 55 open issues, 24 of them bugs. Not only that, the latest OrientDB test coverage report shows only 58% line coverage, and these performance fixes were made very quickly. Can we be sure they won't create more stability problems?

Unfortunately, only percentages were provided
This is true for the first set of results but the problem is Orient Technologies omits that the vendor did provide absolute values for the updated benchmark (Table 1).

Regarding the “neighbors2” test, well this use case has little value in the real world because when you do a traversal of neighbors, you’re interested in retrieving actual information about the neighbors instead of just their ID or key.
Once more the OrientDB team is not transparent and forgets to mention that one of the competitor's employees has stated:
We had started with a version fetching the whole documents, but some of the databases did not survive this tests. However, now there are newer versions available. So, I can rerun the tests using neighbors with data.

Finally, benchmarks many times put production quality code against early-stage, super-optimized versions that are compiled to be extremely fast (but not ready to be used in a production scenario yet).
Even though this is not false, the competitor released the version they tested on June 23, before presenting their revised results, while OrientDB's own performance tests were run against v2.1 Release Candidates and a v2.2 Alpha. So who is benchmarking against early-stage, super-optimized versions?

July 10: A noteworthy OrientDB CEO comment

In the aforementioned OrientDB Google Group thread, the CEO wrote:
I hope next time a vendor wants to create a new benchmark to showcase his product, he will contact the other vendors before publishing completely inaccurate results. While I understand that this is just marketing on their side, it's important to be fair and correct.
This comment shows a double standard: in the blog post titled OrientDB the fastest GraphDB available today?, the OrientDB CEO benchmarked OrientDB against another GraphDB without providing the code or inviting that GraphDB's team to improve the benchmark. So who is being fair and correct?

July 10: A fake account, again

As revealed in my earlier post OrientDB has used a fake account to promote their product in the past. A few hours after OrientDB published this announcement on Hacker News, the following comment showed up:

alpharomeo: Amazing product and great response time from the team at OrientDB. The Jumpstart Package is a fantastic cushion for the transition of a product from POC to Production

And if you check alpharomeo's profile you'll see:

Hacker News: alpharomeo

This account was created just before the comment and has made no other comments besides this one.

July 17: The explanation behind OrientDB v2.2-alpha's wonderful results

OrientDB added the following update to their announcement:
Since OrientDB is not an in-memory DBMS, in order to use the large amount of available RAM on the test machine (60GB), we tried the new Command Cache with OrientDB 2.2-Alpha.
This statement helps understand two things:
  1. why memory was mysteriously absent from the results published by OrientDB;
  2. how OrientDB managed to obtain such impressive results.
Unfortunately, regarding point 2, it's important to note that the benchmark was meant to test the performance of each database's algorithms, not their caches. The benchmark's creator had explicitly said beforehand: As some candidates uses caches, some query caches, we run the tests just one time. We want to show the computation performance. Effectively, OrientDB performs better in this benchmark because it used a cache while the others didn't; in other words, OrientDB cheated, and they admit it themselves.
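To see why a result cache defeats the purpose of such a benchmark, here is a generic sketch (not OrientDB's actual command cache, just an analogy): once an identical query has been answered before, a "benchmark" run measures a dictionary lookup rather than the algorithm.

```python
import time
from functools import lru_cache

def shortest_path(a, b):
    """Stand-in for the real graph computation being measured."""
    time.sleep(0.05)  # simulate the actual algorithmic work
    return abs(a - b)

# The same function behind a result cache, analogous to a query/command cache.
cached_shortest_path = lru_cache(maxsize=None)(shortest_path)

# Honest single-run measurement, as the benchmark's creator intended:
t0 = time.perf_counter()
shortest_path(1, 9)
uncached = time.perf_counter() - t0

# With the cache warmed by a prior identical query, only the lookup is timed:
cached_shortest_path(1, 9)  # warm-up call populates the cache
t0 = time.perf_counter()
cached_shortest_path(1, 9)
cached = time.perf_counter() - t0

assert cached < uncached  # the cache hit skips the computation entirely
```

Any database can "win" this way, which is exactly why the benchmark was designed to run each query only once.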


In the OrientDB CEO's response to one of my posts he said I will address this topic exclusively with my code contributions and by providing outstanding support to customers. I assumed Orient would focus on existing OrientDB problems and refrain from further marketing gimmicks, but the latest announcement unfortunately proved otherwise.

Let's highlight the positive things first:
  • OrientDB made significant performance improvements in a short amount of time;
  • OrientDB collaborated in a competitor's benchmark fixing problems and submitting improvements;
  • Orient Technologies is right in saying their competitor should have contacted them and given them the opportunity to improve the benchmark before publishing results.

And now the less positive things:
  • OrientDB accuses their competitor of not implementing their suggestions, without detailing or substantiating that claim;
  • OrientDB is not transparent enough to reveal their competitor's name, results or articles;
  • OrientDB published a confusing graph which does not portray their competitor's results accurately;
  • OrientDB lied by omission several times: they didn't reveal their competitor's revised results (both relative and absolute) and failed to mention the new driver has no performance benefits;
  • OrientDB is prioritizing performance improvements for this benchmark (whose real-world usefulness they themselves contest) over OrientDB v2.1's existing problems;
  • The OrientDB CEO disapproves of their competitor publishing results without consulting him but forgets he has done the same in the past;
  • A fake Hacker News account was used to further promote OrientDB's announcement.

Finally, this Orient Technologies announcement seems vengeful, almost like they are trying to punish their competitor, which only tarnishes OrientDB's reputation. Orient needs to reassess their PR if they want to be taken seriously and if they want OrientDB to be respected.

In the past I've recommended OrientDB to colleagues and friends but with the problems exposed in my earlier post and with the continuous deceptive tactics I can no longer do the same.

Don't take my words at face value: click the links, search, read, investigate, peruse the OrientDB issues on GitHub, examine the OrientDB Google Group threads and form your own opinion.

UPDATE 1: some good news from OrientDB, hopefully this means a change in marketing practices:

UPDATE 2: added the update OrientDB wrote on July 17, you can find it just before the conclusion.

-- a disgruntled OrientDB user
