posted on October 28, 2000 08:54:49 AM new
I got an interesting tidbit from an anonymous source at eBay.
eBay's problems were caused by an Oracle database software bug. eBay is upgrading to Oracle8i with the distributed and partitioning options, and eBay was bitten by an Oracle problem entirely out of their control. As eBay moves into a decentralized environment they rely on a multi-threaded server, and they have had sporadic errors with the MTS dispatchers and listeners.
Also, my source says that eBay expects further outages over the coming weeks as they work with Oracle to apply the patches.
So don't blame Meg, blame Larry Ellison.
Personally, I think that it is remarkable that eBay runs as well as it does. It is one of the world's most sophisticated online databases.
posted on October 28, 2000 09:03:00 AM new
Yes, but consider the source of this information. I'm not saying it's not true, but in the past eBay has been known to shift the blame in a heartbeat.
If this bug is truly "entirely out of control" then it's hard to imagine why they wouldn't know about it in the first place. Are they beta testing for Oracle, using *our* auctions as the guinea pigs?
[ edited by kathyg on Oct 28, 2000 09:03 AM ]
posted on October 28, 2000 09:10:14 AM new
Good question,
I am running Oracle8i (release 8.1.6) on my Internet systems and I have had to apply dozens of patches to keep the RDBMS engine running.
Unfortunately, eBay is also migrating from Oracle 7.3.4, a relatively stable release, to Oracle8i, which has only been in production about eight months.
A review of Oracle's MetaLink shows lots of iTARs for which Oracle has not yet created patches.
The problem is that eBay is using the more sophisticated multi-master replication functionality of Oracle8i.
Based on what I hear, there will continue to be unplanned outages as eBay fixes the bugs.
Again, eBay is helpless where Oracle bugs are concerned. They do not possess the source code, and they are at the mercy of Oracle to write the fixes.
Here is the text from the eBay announcement board discussing the issues:
Dear eBay Community:
Over the last few weeks, we have been hearing from community members about system performance and periods of site unavailability. We thought it would be best to talk about these issues today, rather than wait until our next formal tech update.
Some of you may recall a message we posted during the summer about a major system upgrade to meet the growing demands of the eBay marketplace. At that time, we explained our need to continually make changes to improve "headroom" - the ability of the system to perform better than it needs to for future growth.
The upgrade has involved a database split and a move to Oracle 8i. This transition has been mostly seamless and transparent to our users. But most recently the transition has impacted your ability to use certain features and functions.
Database Split
One of the biggest projects to improve headroom has been what we're calling a "Database Split." The eBay system is composed of many "databases" (collections of data). These databases contain information about items, users, feedback, accounting, etc. One or more of these databases is involved for any given "transaction" between a user and the eBay system - viewing an item, listing a new item, placing a bid, viewing feedback or one of hundreds of other tasks.
Over the past 9 months, we've refined eBay so that it's now possible to split these databases between systems. The result is that the load is now spread out across several systems. For example, if there is a problem with one function, the rest of the site is still available.
The first database split off was the accounting systems database that records fees and processes invoices, credits, and payments, bringing our monthly invoicing process down to a day. Considering that eBay has over 19 million registered users (making us larger than New York City), this was quite an improvement.
Over the summer, eBay began to split categories onto their own systems that will allow us to continue to grow the number of categories and items. Even though items will eventually be split across many systems, you'll still be able to view listings and search across all categories just as you do today.
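The failure isolation described above (one function can fail while the rest of the site stays available) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, the system names, and the placement of databases on systems are all hypothetical and are not taken from the announcement.

```python
# Hypothetical sketch: names and layout are assumptions, not eBay's design.
class DatabaseDown(Exception):
    """Raised when the system holding a functional database is unavailable."""


class SplitRouter:
    def __init__(self, placement):
        # placement maps a functional database ("items", "accounting", ...)
        # to the name of the system that hosts it.
        self.placement = dict(placement)
        self.down = set()  # systems currently marked unavailable

    def mark_down(self, system):
        self.down.add(system)

    def mark_up(self, system):
        self.down.discard(system)

    def route(self, database):
        """Return the system that should serve a request for `database`."""
        system = self.placement[database]
        if system in self.down:
            raise DatabaseDown(f"{database} on {system} is unavailable")
        return system

    def available(self):
        """List the functional databases that can still be served."""
        return sorted(db for db, s in self.placement.items() if s not in self.down)


router = SplitRouter({
    "items": "system-a",
    "users": "system-b",
    "feedback": "system-b",
    "accounting": "system-c",
})
router.mark_down("system-c")   # the accounting system fails
print(router.available())      # items, users and feedback stay reachable
```

With everything on a single system, the same failure would take every function down at once; the split limits the damage to whatever lives on the failed system.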
Oracle 8i
In order to support these ongoing category splits, we had to make another major upgrade from Oracle 7.3.4 to Oracle 8i. The upgrade will allow us to make certain databases of information available across systems. As this upgrade "bakes" into the system, we have come across areas of contention during high traffic times – in particular Sunday evenings but fortunately not every day.
During these times, we have disabled certain features, such as Seller Search or the Seller tab on My eBay, in order to maintain the site’s functionality. While all features are important, we try to temporarily disable those that have the most significant impact to system load and those that minimize impact to people’s ability to find, view and bid on items.
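The trade-off in the paragraph above (switch off the features that cost the most load while protecting finding, viewing and bidding) might look something like this minimal sketch. The feature names, the load costs, and the capacity figures are invented for illustration; nothing here reflects how eBay actually decides.

```python
# Hypothetical feature table: load costs and "core" flags are made-up values.
FEATURES = {
    # name: (relative load cost, needed to find/view/bid?)
    "view_item": (1, True),
    "bid": (1, True),
    "search": (3, True),
    "seller_search": (5, False),
    "my_ebay_seller_tab": (4, False),
}


def features_to_disable(current_load, capacity):
    """Pick non-core features to switch off, heaviest first, until load fits."""
    disabled = []
    optional = sorted(
        (name for name, (_, core) in FEATURES.items() if not core),
        key=lambda name: FEATURES[name][0],
        reverse=True,
    )
    for name in optional:
        if current_load <= capacity:
            break
        current_load -= FEATURES[name][0]
        disabled.append(name)
    return disabled


print(features_to_disable(12, 8))   # sheds only the heaviest optional feature
print(features_to_disable(14, 8))   # needs both optional features off
```

Core features are never candidates for shedding in this sketch, so a load spike that optional features cannot absorb would still leave the site over capacity.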
We will continue to fine-tune the system for optimal performance with Oracle 8. It's difficult to predict, but we expect this transition to be completed in about 7 to 10 days during which there may be periods of certain feature unavailability. Our goal is to make these improvements in advance of the holiday season.
posted on October 28, 2000 09:18:14 AM new
Well, all I know is when other businesses get bugs they call for pest control.
It is my understanding that Oracle bugs are like the common termite but work faster, in some cases turning a database into dust in a matter of seconds.
And if you're having a health inspection and the health department finds bugs, they give you 30 to 45 days to get rid of them.
So I say let's give eBay the weekend; if it can't get rid of the bugs, what say we all turn them in to the health department.
If I were one of the eBay sellers selling computer hardware or software I would be very concerned, since they stand the biggest chance of being contaminated by these pests.
posted on October 28, 2000 09:19:56 AM new
It's not uncommon to have software bugs, hardware problems, ISP outages or any number of combinations that might affect a website.
I believe that many of eBay's troubles are more implementation based--they didn't apply a Sun patch once and that caused an outage, they move features and services out before they are ready for prime time. Some of this is just inevitable--you won't know what problems may occur until it is actually in use: every action that a computer takes is based on something it is programmed for and those that it is not programmed for as well. Programmers apply as many "what-ifs" to the code as they are aware of. eBay users are guinea pigs, but so is eBay.
[ edited by mballai on Oct 28, 2000 09:24 AM ]
posted on October 28, 2000 09:39:13 AM new
mballai:
Good point. Oracle has millions of lines of code that can interact in a million-factorial ways.
However, I disagree about the eBay failure to apply that Solaris 2.6 patch. OS patches can CAUSE unexpected outages, and it is never a good idea to apply an APAR without a known problem.
However, the outages could have been avoided.
There are world-class Oracle DBAs that eBay could use to ensure that they never have downtime during database reorgs. I hear rumors that they are unwilling to pay the $400/hour that the top DBA gurus command.
You are right. eBay shares the blame because they are too cheap to hire a top-gun Oracle DBA to ensure availability.
On a more ominous note, there will be more outages next week as the 8i migration continues. . . .
Cheeses of Nazareth
[ edited by cheeses on Oct 28, 2000 09:45 AM ]
posted on October 28, 2000 09:43:50 AM new
mballai: I couldn't agree with you more on this one:
"I believe that many of eBay's troubles are more implementation based"
Additionally, I can't help wondering if the non-technical powers-that-be are not overriding the recommendations made by the engineering staff, a common problem. Except in this case, it steals directly from the pockets of their customers.
posted on November 1, 2000 04:51:28 AM new
I hear that the eBay outage this morning is related to another bug in the upgrade from Oracle 7.3.4 to Oracle 8.1.6.
I hear that eBay has no control over this bug and must wait for technical support from the vendor.
posted on November 1, 2000 07:16:22 AM new
I don't have near the problem with them being down as I do with being treated like a fool.
The attitude seems to be - Of course we lie - we're a business now! Could not use system for a solid 4 hours this morning and they won't admit it. I know it cost them $500 to $600 just from me this month as I did not keep as many listings up and my sales were down.
posted on November 1, 2000 07:19:32 AM new
THE PROBLEM IS EBAY - PERIOD - after 5 + years on eBay I have heard every excuse in the book as to why they are down, and guess what - it is NEVER EBAY'S FAULT!!!!!
AMAZING, isn't it.
Odd, how every major stock brokerage company, which has zillions of hits daily, and Yahoo and others don't have the endless problems of eBay......
Can you imagine what would happen to an on line stock broker if they had the problems eBay has!
eBay is headed by a group of folks who have never learned to accept responsibility for their problems, and actually believe we are so stupid that we will endlessly believe their excuses. In other words, they are insulting their users by their endless excuses, actually believing that we are stupid enough to believe them!
posted on November 1, 2000 10:38:58 AM new
captainkirk:
I agree to a point. However, I empathise with eBay because a vendor software bug is totally out of their control.
Oracle is the world's most popular (and complex) database.
Don't kid yourself into thinking that these outages were unforeseen. Sure, by spending millions for a completely replicated server system they could have been avoided, but eBay made a business decision based on economics.
True, the Wall Street systems have continuous availability, but they process billions of dollars every day.
The downtime cost to eBay is less than $100,000 per hour.
It does not make economic sense to spend tens of millions for continuous availability.
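For anyone who wants to check that reasoning, here is a rough break-even sketch. The $100,000 per hour figure is the upper bound quoted above; the $20 million is only an assumed stand-in for "tens of millions", and the function name is made up for this example.

```python
def break_even_hours(availability_cost, downtime_cost_per_hour):
    """Hours of avoided downtime needed to pay back the availability spend."""
    return availability_cost / downtime_cost_per_hour


# $100,000/hour is the post's upper bound; $20 million is an assumed
# stand-in for "tens of millions".
hours = break_even_hours(20_000_000, 100_000)
print(hours)  # 200.0
```

Under those assumptions the replicated system only pays for itself if it prevents about 200 hours of outage, which is more than eight full days of downtime.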
posted on November 1, 2000 11:02:17 AM new
I do think that eBay should be spending whatever money it takes to be up and running save for scheduled maintenance and upgrades. There should be something in place that provides rapid recovery from errors. When I was working on a big program, there was an error program that sent you a "Hey Stupid Memo": your program blew up on line XXX of this function and the memory variables involved were .... It didn't take very long to diagnose the cause and fix it. By the time the program was ready to run for real, there was very little downtime.
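A memo like the one described above can be approximated in a few lines. This is just a sketch of the idea, not the original program: `error_memo` and `average_bid` are hypothetical names, and the report simply uses the function name, line number and local variables that Python records on an exception's traceback.

```python
def error_memo(exc):
    """Build a short report: where the failure happened and the local variables."""
    tb = exc.__traceback__
    # Walk to the innermost frame, where the exception was actually raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    frame = tb.tb_frame
    lines = [
        f"{type(exc).__name__}: {exc}",
        f"in function {frame.f_code.co_name}, line {tb.tb_lineno}",
        "local variables:",
    ]
    for name, value in sorted(frame.f_locals.items()):
        lines.append(f"  {name} = {value!r}")
    return "\n".join(lines)


def average_bid(bids):
    total = sum(bids)
    return total / len(bids)   # fails when bids is empty


try:
    average_bid([])
except ZeroDivisionError as exc:
    print(error_memo(exc))
```

The report names the failing function and shows that `bids` was empty, which is usually enough to find the cause without rerunning anything.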
posted on November 1, 2000 11:49:12 AM new
My point is that I don't see why ebay hasn't been able to make the appropriate changes to figure out how to make their system stable. And I don't think that had to cost "tens of millions of dollars for continuous availability".
They could have selected different software or hardware. Or maybe stopped adding bells and whistles. Or changed their business practices and/or auction rules. Whatever they need to do.
I empathize with a company that gets hit with an unexpected vendor bug the first time. Maybe the second time. However, on the "fool me once, shame on you, fool me twice, shame on me" theory, after 5 years of ineptitude, they need to do something different.
So, now for the relevant questions: how many servers does ebay have? How many do they need? (might not really need any more!) Are solaris servers the most cost effective solution? Should they have increased the complexity of their auction system while it was still unstable? Does their staff of IT folks have the right skills? Should they have chosen a different database than oracle?
Note that many of the alternative choices ebay could have made would have resulted in NO additional cost.
Now THIS is the education we really need, but alas, won't get from ebay.
posted on November 1, 2000 01:51:11 PM new
I have an admittedly simplistic view that comes around to pointing the finger at eBay: Why are they moving to a cutting-edge system that is filled with BUGS?!?!?
This is just another example in an endless list of companies that plunge into mass-marketing without giving sufficient thought to how they are going to serve the customers after the customers arrive. Other examples include: Toys-R-Us and their internet selling disaster of last Christmas; AOL and their disaster of a couple years ago; ATT WorldNet, which promised 24-hour customer service during their marketing blitz and couldn't deliver (without a wait of 90 mins on phone).
I would much rather see eBay split into manageable entities than continue with this massive, amorphous buy-anything-old-new-antique-fake-whatever. A split would accomplish at least 3 things: They wouldn't need cutting edge database technology and could rely on proven software; the smaller "forums" could have distinct personalities; and buyers would be able to find items, which is getting harder and harder these days.
posted on November 1, 2000 02:09:49 PM new
Ah - it would take a crystal ball to answer all those questions. You can't predict how well new technology, new hardware and new software will interact and every employer in the world is hunting for that rare diamond of a tech employee that will tell them those answers for a salary.
LOL
Oracle is a very very new kind of database and just finding Oracle DBAs that have any experience over a semester of classes is pretty well impossible. To know those types of answers, you have to have years of experience with just that kind of set up, and Ebay's set up is so huge and hard hit with customers that they are having to find out the hard way how this all works.
For heaven's sake - give them credit for what they can manage. There is no one else on this planet doing what they are doing and so far it is a miracle that they manage to keep from having to limit everyone to 3 auctions at a time or doing one search a day.
Crystal ball? Hardly. The fact that you would be unable to answer them without such a device doesn't mean that others would be so constrained. It isn't that hard to do a price/performance comparison of servers, for example. Or to look at load factors to determine if more are needed.
And as far as Oracle being "very very new"...well...guess it depends on your relative timeframe. Compared to the age of the universe, they are. Compared to the age of the internet, they are very, very old. Just for the record, Oracle is 20 years old.
And your statement that "no one on the planet is doing what they are doing" is, in some sense, quite apt...because others are doing it better. See, for example, airline reservation systems.
[ edited by captainkirk on Nov 1, 2000 02:45 PM ]
posted on November 1, 2000 03:04:57 PM new
I have to agree that ebay's performance has left a lot to be desired. However, that is their issue to deal with in whatever manner they desire. If they continue to conduct business in this manner I am confident they will come to a juncture in time where they realize things should have been dealt with differently. As we all know they have built their biz on the efforts of online sellers. They will soon feel the Wind of Change, with scores of dedicated ebay sellers jumping to Yahoo and getting their feet wet. It is only a matter of time before this will affect ebay's market share. Until they come to this realization, our concerns have no impact! Only when they feel the negative effect on the bottom line will they show any interest in what we have to say!
Oracle DBAs are a rare breed, and it takes years of experience to master Oracle.
I know a DBA in San Jose who makes over $300,000 per year, and he told me that eBay cannot justify a Computer Scientist who earns more than a Vice President.
Consequently, they hire cheap beginner DBAs (only $100,000 per year), and look what they get.
Oracle is a GREAT database when properly administered, and lots of folks have made the transition from 7.3.4 to 8.1.6 without service interruptions.
posted on November 1, 2000 04:15:36 PM new
No a plane reservation system is not anywhere close - I can't believe what some people think is comparable.
Who uses a plane reservation system - clerks and travel agents?
Does the general public use it? a lot? My sister spends every night, all night messing with Ebay - so do lots and lots of other people I know. The amount of traffic on that system has to be horrendous. The only thing I can think of that has close to that much use is AOL and I'll bet AOL has probably half the traffic. No business system would possibly get that much traffic. Only systems that are used by the public would and then only if they are used really hard as in something like an average of 100 transactions a day for every internet user in the states (that is just a really really off the wall estimate - so don't try to argue the accuracy please).
The Oracle version that handles large systems ~is~ new compared to other large database systems, which are decades old. I can't think of any other new semi-reliable DB system for mainframes, but I'll admit that it's not my job to know them. Oracle was not written for large and it doesn't work terribly well yet that way from what I hear. So if someone else's database is running fine, it may be that the relative size is not comparable.
I'm not going to second guess Ebay's choice of DBAs, experienced or not, seeing how hard it is to find any experience in it, even if you do pay the price.
At any rate, in my experienced opinion, Ebay has unique overwhelming problems and I think the job they are doing overcoming them without the consumer being too badly inconvenienced has been good. I don't think the average consumer/seller has the slightest room to complain about down-time, nor the slightest conception what it takes to keep a dynamic system going.
Try building your own ISP from scratch and running it for a while and then perhaps you can judge Ebay.
posted on November 1, 2000 06:34:57 PM new
It's baloney to say that eBay cannot afford to hire a primo DBA. Do the math...if eBay loses a couple million dollars from an outage that a $300,000 DBA could prevent, I'd say eBay would do it. I doubt they get cheap on IT, but they may still not be investing enough. IT runs their business, if they fail to take care of their goose, the golden egg is gone.
posted on November 1, 2000 07:08:42 PM new
"Oracle is a very very new kind of database" ?????
When I quit my medical research position in 1993 our database was Oracle and had been for at least 3 years. Perhaps this particular database of Oracle is new, but Oracle certainly isn't.
I agree that given the financial stakes and eBay's ability to pay, they certainly could get a DBA of top notch quality. They are out there, and if eBay was seriously interested in a smooth running operation and serving their customers, they would find one, pay a king's ransom and hire the dude (or dudette).
posted on November 5, 2000 11:08:13 AM new
Since ebaY now knows (and we all know that they have known since at least August about these problems) that the site is going to be dicey for the near future - they should very simply auto-refund fees on every auction running on their system every time they have a glitch in a critical function that affects the ability to search and bid - NOT just an outage.
OR they should announce on the AB, to the press and in an email to every seller on their system (the same sellers they send INVOICES to reliably) that they are having serious non stop problems that will make their system very unreliable for the next X amount of time, and that sellers will LIST AT YOUR OWN RISK. THAT is not ROCKET SCIENCE, and a very simple thing to do. And it doesn't cost them anything to do it. But oh yeah, it will probably cause their stock to drop, and since MEG is selling 600,000 shares they wouldn't want to do anything to cause the value to go down now would they???
No, they would rather lie, SCREW all their sellers, take their money for a service they KNOW they can't provide, when ebaY knows that only a small portion of their sellers KNOW the reason their items are not selling is because of ebaYs OUTAGES!!!
A company that proves again and again that they have NO ETHICS!
-Rosalinda
(my fingers are just not working today!)
TAGnotes - daily email synopsis about the Online Auction Industry
http://www.topica.com/lists/tagnotes