Monday, June 20, 2016

One Year of OrientDB Leaks

When I wrote my first article one year ago I thought it would be my last but then stuff happened... OrientDB's CEO made false allegations, OrientDB cheated on a competitor's benchmark... etc. Additionally readers still send me encouraging e-mails, comments and the occasional question about my current feelings towards OrientDB. All things considered it's appropriate to do an overview of last year and give a brief update.

June 20, 2015: My first and most comprehensive article about OrientDB issues

This is the longest and most comprehensive post I've written. In it I cover my personal experience with OrientDB and also the many events I've witnessed while using it.

Some key takeaways: 
  • OrientDB is unstable and plagued with bugs;
  • Paying companies are quitting OrientDB after months of use;
  • OrientDB team prefers to add new features instead of fixing existing bugs;
  • OrientDB Node.js driver's author recommends rival solutions;
  • OrientDB team resorts to a fake account to boost their product.

I also suggest reading the comments section where some readers give their own testimonials.

June 27, 2015: Rebuttal to OrientDB's CEO allegations about this blog

The OrientDB CEO read my first article and addressed it on his personal blog in a post titled 5 Wonderful Years on Github -- Thanks to the Community!. Since his post contained unsubstantiated and deceitful remarks, a response was due.

July 11, 2015: OrientDB prefers to cheat benchmarks instead of addressing performance

Back in June a GraphDB competitor created an open source benchmark to compare the performance of different DBs. As usually happens in these cases, vendors choose data sets and tests covering their DBs' strengths and, unsurprisingly, their product beats the others. OrientDB was eventually added to that benchmark and did not fare well. Orient Technologies didn't like it and quickly released OrientDB v2.2 Alpha to address its poor results, publishing a sour article about it. All this would be OK if it weren't for the fact that OrientDB cheated: they achieved their results by enabling a command cache while keeping the other DBs' caches turned off, leading to an unfair comparison.
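To illustrate why a one-sided cache makes such a comparison unfair, here is a minimal Python sketch. It is not the benchmark's code and says nothing about how OrientDB's command cache is actually implemented; the ToyEngine class, the query string and the repetition count are all hypothetical. It only shows that when a benchmark repeats the same query, an engine with a result cache does the real work once while an engine without one does it every time.

```python
class ToyEngine:
    """Hypothetical stand-in for a database engine (not a real driver or API)."""

    def __init__(self, use_cache):
        self.executions = 0                      # how often real work was done
        self._cache = {} if use_cache else None  # result cache keyed by query text

    def _execute(self, query):
        # Stand-in for the expensive part of answering a query.
        self.executions += 1
        return sum(range(10_000))

    def run(self, query):
        if self._cache is None:
            return self._execute(query)
        if query not in self._cache:
            self._cache[query] = self._execute(query)
        return self._cache[query]


def benchmark(engine, query, repetitions):
    """Run the same query repeatedly and report how many real executions happened."""
    for _ in range(repetitions):
        engine.run(query)
    return engine.executions


# Hypothetical query text; only its repetition matters here.
QUERY = "shortest path between A and B"

cached_runs = benchmark(ToyEngine(use_cache=True), QUERY, 1000)
uncached_runs = benchmark(ToyEngine(use_cache=False), QUERY, 1000)

print(cached_runs, uncached_runs)  # prints: 1 1000
```

Both engines answer 1,000 queries, but only one of them is actually measured on query execution. A fair run either enables comparable caches everywhere or disables them everywhere.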

May 8, 2016: Gartner confirms OrientDB's reference customers are struggling with bugs

In May I was made aware of Gartner's 2015 Magic Quadrant full report. Even though I shared it a bit late it's still relevant since one of the cautions pointed out by Gartner gives credence to my previous articles:

CAUTIONS
Support, documentation and bugs: Orient received one of the lowest scores in our survey for support and documentation. Furthermore, a high percentage of its reference customers cited bugs in its software as a problem.

May 18, 2016: OrientDB releases v2.2

"After 11 months of development and QA testing" OrientDB released its long-awaited minor version update, 9 months behind schedule. This release promises a lot; in the OrientDB CEO's own words:
OrientDB 2.2 is packed full of new features (I’m sure you’ve all heard of Teleporter), though the main focus for this release was to strengthen security and improve stability & performance.
In a subsequent article OrientDB's CEO goes further and says "we added new features we knew our users would be excited about: like the Incremental Backup, Load Balancing, Command Cache, Parallel Queries, Sequences, Reactive Live Query, Pattern Matching, Teleporter" but wait, there's "much more".

This is a mixed message: was the focus on security, stability and performance, or was it on incremental backup, load balancing, command cache, parallel queries, sequences, reactive live query, pattern matching, teleporter and much more?

Considering the 11 months of development and QA testing I'm less than impressed with Orient Technologies releasing v2.2 with more than 57 bugs (and 129 issues) already open and assigned to the next hotfix release - v2.2.x.

OrientDB v2.2 bugs at release time

All this insistence on pushing new features makes OrientDB a jack of all trades, master of none.

HackerNews

News of OrientDB v2.2 found its way to HackerNews and the reactions were not enthusiastic:
throwaway2016a: Has anyone had any luck with Orient and large database? I tried to use it with a 80,000,000 product database with about 60,000,000 updates a day (edges between sellers and products labeled by price) and it was unusable with the same sizing as a mySQL database that handled it easily. It seemed great though. Especially since neo4j open source is quite restrictive (clustering is a commercial only feature not available in the OSS version).
SliderUp: I had the same problem with a fairly 50/50 read/update database, about 20M reads/20M updates a day. Brought it to it's knees. Postgres handled it no problem.
pendexgabo: I've had several issues with the "distributed" setup among others.
And of course, OrientDB reacted in its customary way by creating a fake account to boost its product:
OrientDB HN fake account

May 23, 2016: OrientDB publishes v2.2 code coverage and bug count

Soon after the release Orient Technologies published an article titled After 11 months of development and QA testing, we’re thrilled to finally release Production-Ready OrientDB v2.2 GA! tackling Code Coverage and bug count which, by coincidence, were two subjects I wrote about on May 8.

Code Coverage

Regarding Code Coverage, I appreciate OrientDB's willingness to talk about it and show some numbers as it's definitely a step in the right direction. An excerpt from the article:
In this period [11 months], we fixed about 2,000 issues and increased the code coverage by +11 points (from 55% to 66% of covered lines as we speak)
This doesn't tell the full story as they don't show how code coverage progressed throughout the 11 months. I collected and published these numbers earlier in my July 11 and May 8 articles and here is how it looks:

OrientDB Code Coverage
* Even though Line coverage remains unchanged today, Instruction, Branch and Class coverages improved 1% since May 23.

The +11 points rise only happened in the last two weeks before publishing the article. The fact that OrientDB is finally tackling and even discussing this is positive but their methodology needs improvement. As it stands automated testing looks like an afterthought and not something they practice daily.

Their article goes further:
Why is the average coverage so far below 100%?
These are some factors to consider that impact the average:
  • The core engine has excellent coverage, however other components such as the HTTP API have less coverage. This brings the average down.
  • There are components needed for older OrientDB version compatibility that are not well covered by test cases. This brings the average down, but they are never used by the last version of OrientDB.
  • There are experimental components of OrientDB (not even documented yet) that have not reached maturity. They usually have few or no test cases. Many times, we stop development on them before the first RC. This also brings the average down.

And by "excellent coverage" the OrientDB CEO means 69%:

OrientDB Code Coverage breakdown

Bug Count

When it comes to OrientDB's bug count, the OrientDB CEO stated:
A young player in the DBMS space could seem like a better product due to their low number of bugs. Or a simple product that handles a few use cases can seem like they have less issues compared to one that is more complex. However, the more people that use a DBMS and the more complex the functionality and number of possible use cases, the more bugs it will have.
Agreed, it's reasonable to say the more a piece of software is used the more of its functionality is tested and hence there will be more opportunities for bugs to be encountered in its crevices. In other words, the more popular a DB is the more likely it is for bugs to be found.

But then the OrientDB CEO offers absolute bug counts from different DBs without comparing or mentioning each DB's popularity. I'm not aware of any good DB usage statistics, but since OrientDB trusts the DB-Engines ranking as a measure of popularity, let's use it to normalize bug counts and improve the comparison.

OrientDB Bugs are unusually high for its popularity
OrientDB Bug count normalized
Note: To keep things clear, this is not the holy grail of DB bug comparisons, nor am I saying the DB-Engines popularity score is the best usage indicator, but this table does offer a better picture of bugginess in the DB world than comparing absolute bug counts as OrientDB does.

By using DB-Engines popularity score to level the comparison one thing becomes apparent: OrientDB has around 4x more bugs than its competitors' average. Yet they make the following statement:
OrientDB is way below the average. Comparing our 343 bugs with 5,252 of MySQL means we have only 6% of their amount.
If OrientDB wants to be below average they should aim for fewer than 80 bugs instead of 343. And if they want to beat MySQL's bugs per popularity they need to lower their bug count to 22. Or, to put it another way, OrientDB has 1509% of the bugs MySQL has when adjusted for popularity.
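The normalization itself is simple arithmetic: divide each DB's open bug count by its popularity score and compare the resulting ratios. Below is a minimal Python sketch of that calculation. The function names are mine, and the two databases and their numbers in the example are made up purely for illustration; they are not the DB-Engines scores or bug counts behind the figures above.

```python
def bugs_per_point(open_bugs, popularity_score):
    """Open bugs divided by popularity score (higher means buggier for its size)."""
    if popularity_score <= 0:
        raise ValueError("popularity_score must be positive")
    return open_bugs / popularity_score


def relative_bugginess(own_bugs, own_score, other_bugs, other_score):
    """How many times more bugs per popularity point 'own' has than 'other'."""
    return bugs_per_point(own_bugs, own_score) / bugs_per_point(other_bugs, other_score)


def parity_bug_count(own_score, other_bugs, other_score):
    """Open-bug count at which 'own' would match 'other' in bugs per point."""
    return other_bugs * own_score / other_score


# Made-up illustrative numbers, NOT real bug counts or DB-Engines scores.
small_db = {"bugs": 300, "score": 50.0}
big_db = {"bugs": 5000, "score": 1000.0}

absolute_share = small_db["bugs"] / big_db["bugs"]
normalized = relative_bugginess(
    small_db["bugs"], small_db["score"], big_db["bugs"], big_db["score"]
)
target = parity_bug_count(small_db["score"], big_db["bugs"], big_db["score"])

print(f"{absolute_share:.0%}")  # prints: 6%
print(f"{normalized:.0%}")      # prints: 120%
print(target)                   # prints: 250.0
```

In this made-up case the smaller DB has only 6% of the larger DB's bugs in absolute terms, yet 120% of its bugs once popularity is taken into account, and it would need to drop to 250 open bugs just to break even. That is the kind of gap an absolute comparison hides.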

Taking my own experience as an example, I've used MySQL and other SQL DBs for over a decade and never had to submit a single bug report. When using OrientDB I filed close to a dozen issues in less than 10 months while performing little more than CRUD operations (not even counting the known issues I stumbled upon).

Full disclosure, on June 1 OrientDB added the following to their article:
UPDATE: As of June 1st, we have 270 bugs (73 bugs fixed in 2 weeks) which is also 7 less than Cassandra.

Conclusion

What's the situation one year later?
On one hand we have better security, better code coverage and more openness to discuss testing and open bugs. On the other hand we see Orient Technologies committing the same two old sins: deceit and focusing on new features instead of bugs.

Why so many new features amidst a high bug count and complaints of instability and lack of performance?
I can only see one reason: marketing. New features are sexy, developers like developing them, the press loves covering them and people enjoy discussing them. Bugs are boring. Not only that but more features lead to comparison tables like the one below where newcomers get impressed by the wealth of features OrientDB offers. What newcomers don't realize is that OrientDB features lack the maturity other DBs offer - they certainly fooled me.

OrientDB features comparison

You can put lipstick on a pig but it's still a pig. What OrientDB needs is not new half-baked features but a deep intervention to address its lingering root problems.

But is v2.2 the silver bullet that will make OrientDB stable?
I don't think so. I started using OrientDB around v1.7 and back then v2.0 was said to fix all issues. v2.0 came out and brought more bugs with it while remaining unstable. A few months later v2.1 was heralded as the solution for all our troubles: the team would focus on fixing bugs, and the release was announced with "Rock Solid Stability", yet users, including reference clients, don't agree. See a pattern? v2.2 already started off on the wrong foot by having dozens of known bugs scheduled for v2.2.x so my hopes are low.

Is there a solution?
Yes there is. Exactly one year ago I proposed that OrientDB put a freeze on new feature development and that suggestion still holds true today. The OrientDB team should spend 6 months focused exclusively on fixing bugs and improving performance (71 open issues). If they are able to do this, OrientDB will become a stable platform by 2017.

Would I recommend OrientDB?
I have no problem telling people to try OrientDB for small/pet projects, for experimenting with graphs, or perhaps for some small/medium research project. If this fits your case go for it, you'll have fun. But if you are looking for something that will eventually end up in production or in some mission critical environment, do not use OrientDB: it is not ready for production yet.

OrientDB's problem is not so much with Orient Technologies' vision or ideas; they are clearly intelligent people. The problem lies in their methodology, over-ambition and unwillingness to compromise. Marketing instead of engineering is commanding their development strategy and that is clearly failing.


As always, don't take my words at face value: click the links, search, read, investigate, peruse the OrientDB issues on GitHub, examine the OrientDB Google group threads and form your own opinions.

Have you been using OrientDB for more than 6 months? Let me know your story in the comments or send me an e-mail.


-- a disgruntled OrientDB user

Sunday, May 8, 2016

Gartner Recognizes OrientDB in 2015 Magic Quadrant but...

I've been lying low as neither I nor my team has been actively developing anything with OrientDB, hence there hasn't been much to report. But recently I received an e-mail from an OrientDB Leaks reader which I can't ignore, so here's the gist.

Back in October 2015 Orient Technologies published an article titled Gartner Recognizes Orient Technologies in 2015 Magic Quadrant for Operational Database Management System which, as usual, paints a very positive picture of OrientDB. The aforementioned reader gave me access to Gartner's complete report (for paying customers) and in it they include the following caution about OrientDB:

CAUTIONS
Support, documentation and bugs: Orient received one of the lowest scores in our survey for support and documentation. Furthermore, a high percentage of its reference customers cited bugs in its software as a problem.

In other words OrientDB's reference customers confirm my criticism.

This report was published 7 months ago and if you are wondering whether it's still relevant today I invite you to check OrientDB's 1210 open issues, of which 377 are bugs:

OrientDB Bugs and Issues
OrientDB Bugs


Orient Technologies seems busier giving t-shirts for tweets than fixing bugs.

OrientDB's bug count wasn't the only thing getting worse: their latest test coverage shows 55% line coverage, which is lower than the 58% line coverage recorded on 07/09/2015.

People occasionally ask me if my opinion about OrientDB has changed over time. Because I have avoided using it I can't comment on recent versions but from the above, and from their newsfeed, they still seem too focused on marketing and new features.

Last but not least, a big thanks to the anonymous reader who sent me Gartner's report.

UPDATE 1: as @CoDEmanX has observed you can access Gartner's report at https://www.gartner.com/doc/reprints?id=1-2PMFPEN&ct=151013
UPDATE 2: OrientDB v2.2 was released (9 months behind schedule) and as usual the OrientDB team created a fake account, alphatech, to promote it on HackerNews.

-- a disgruntled OrientDB user

Saturday, July 11, 2015

Deceit, the biggest of OrientDB problems

After my counter reply to the OrientDB CEO I wasn't expecting to write any posts soon but the last announcement from Orient Technologies compelled me to write again, so here we go.

On July 9, Orient Technologies announced Our Take on NoSQL DBMS Benchmarks and the New OrientDB Performance Challenge which I invite you to read after you finish reading this blog post. In it Orient accuses an OrientDB "competitor" of not following their suggestions and publishing partial results. This is said without giving details, references, quotes or any evidence, leaving the reader to imagine who the competitor is and what the competitor has published. Unfortunately, the announcement doesn't tell the full story and in this post I'll provide the context that led to Orient's announcement and also expose their deceitful marketing, perhaps the worst of many OrientDB problems.

Saturday, June 27, 2015

OrientDB's lack of transparency

Some people have described my previous post as bearing hate and I'd like to clarify that the tone is disappointment. I'm still using OrientDB and I would very much like all major issues to get fixed. OrientDB's success would make my life much easier as it would save me from migrating our data.

With that out of the way, the Orient Technologies CEO has partially replied to my earlier post in 5 Wonderful Years on Github -- Thanks to the Community! Below is my counter reply.

Saturday, June 20, 2015

Why you should avoid OrientDB

Up until now I've been a silent bystander regarding constant OrientDB issues but recent events changed my mind and I've decided to intervene so others can learn from my experience.

TL;DR: OrientDB is like a book with a great cover and some interesting first chapters but once you reach the middle of the book you realize you've been mistaken and it's not all it promises to be. If you don't want to read the whole story scroll down to the conclusion. (if you don't read the whole post please refrain from posting comments).

3rd quarter of 2014: decision to use OrientDB

At the end of Q3 2014 my team agreed a graphDB would be a perfect fit for a new project and we started evaluating options. OrientDB came out on top in terms of features and because of its permissive license. Investigating a little deeper revealed a few stability issues (for example: Continual frustration with OrientDB durability and "With 1.7.x, they [issues] have been mostly around stability") but we thought v2.0 would fix all these issues... We did a proof of concept, all looked good and we jumped in with both feet, adopting OrientDB. This was a decision we would later regret. Below is a timeline of events and issues that changed our mind about the product.

Friday, June 19, 2015

About OrientDB Leaks

Let's start with what OrientDB Leaks is not. OrientDB Leaks is not about announcing upcoming features, nor is it a rumor website. And OrientDB Leaks has no affiliation with Orient Technologies.

OrientDB Leaks is about people's experiences in using OrientDB, namely negative experiences. It provides a medium that is not moderated by OrientDB where people can honestly share their experiences. Unfortunately, in the past, less positive messages have been deleted from OrientDB issues at Github. Here that won't happen.

Who am I and why don't I reveal my identity? I am an OrientDB user who has used the product for more than half a year. Many people like myself can't share their identity due to professional and career reasons. That, however, does not change the veracity of what's written here and I'll do my best to substantiate all my statements.

With all this said I invite you to read my story about the OrientDB issues that made us give up.

Do you have a story you'd like to share about OrientDB? You can do it publicly or anonymously, just drop me a line.

-- a disgruntled OrientDB user