
Opinion: Performance benchmarks are worthless, here’s how to make them better


AMD is getting ready to launch its next-generation architecture (code-named Trinity) and invited a bunch of us out to Austin to see it. I can’t talk about the technology until it is launched, but one of the events at the show was a head-to-head comparison between this AMD technology and Intel’s top-shelf products. In each test (including productivity, video enhancement, and file compression) AMD Trinity technology wasn’t just faster, it was substantially faster.

Though the demonstration was impressive, it also reminded me why benchmarks really aren’t that useful anymore. Not only do they fail to reflect what each of us individually does, they don’t factor in cost, device size, or design, each of which might be more important than any direct performance measure.

For instance, Apple hasn’t led benchmarks in years. Side-by-side with competitors, the iPad and iPhone actually tend to appear relatively slow (they often use older networking, processor, or storage technology). They are also relatively expensive, yet lots of folks still prefer them, suggesting benchmarks as they currently exist are worthless to these buyers. They rank other things higher.

So what would a perfect benchmark look like?

How do you work?

The perfect benchmark would be derived from an ongoing analysis of how you use your hardware. We all change as we age, and what we do even changes from day to night, from weekdays to weekends, and on vacation, so the capture should occur over a period of time.

It should also look for critical points, like what annoys us and what thrills us — not only in terms of what we are doing, but what we are talking about. In short, factor in our social-networking activity in things like Facebook and Pinterest.

Finally, it would rank all aspects of our interest and factor in cost: not only the cost of buying the product, but also the cost in time of putting the product into service and maintaining it, and our sensitivity to downtime.

Analyzing the device

Since it has proven impractical to go into a store and run a benchmark on a shelved PC, and impossible to do the same thing if we want to buy online, the ideal benchmark would also need to capture the performance of systems on the market. Alongside this objective data, it would also capture subjective data on design, expected reliability, and time to obsolescence. While the latter two could come from historic data (much like Consumer Reports does with its rankings), the design analysis would be based on how someone similar to you in terms of personality type and taste would rank the product.

Finally, given that we live in an online “cloud” world, a major portion of the data captured would need to be on the services the device connected to, the apps it would load, and the overall end-to-end user experience.

In the end, everything would be mathematically rendered.

The result

The result would be accessible on a site where you could go, log in, and either specify the type of product you were looking for or enter a number of products you were considering. The system would then give you a set of choices listing the key analytical elements of each. If you saw an element that wasn’t current, or that you didn’t agree with, you could change it and thus change the ranking.

You could see an overall ranking of around 10 products with some specific ones flagged: the lowest priced, the best match to you, and the most balanced (best value for the money as defined by your unique needs and tastes). This is also somewhat similar to what Consumer Reports tries to do, but more advanced.
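
To make the idea concrete, here is a minimal sketch of how such a ranking might be computed. Everything in it is an assumption for illustration: the `Product` class, the element names, the 0-to-1 scores, and the definition of "most balanced" as match score per dollar are hypothetical and not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    price: float   # purchase price in dollars
    scores: dict   # hypothetical element name -> score between 0 and 1


def match_score(product, weights):
    """Weighted average of a product's element scores, using your weights."""
    total = sum(weights.values())
    return sum(w * product.scores.get(k, 0.0) for k, w in weights.items()) / total


def rank(products, weights, top_n=10):
    """Return the top products plus the three flags described above."""
    ranked = sorted(products, key=lambda p: match_score(p, weights), reverse=True)[:top_n]
    flags = {
        "lowest_priced": min(ranked, key=lambda p: p.price).name,
        "best_match": ranked[0].name,
        # Assumption: "most balanced" means the best match score per dollar.
        "most_balanced": max(ranked, key=lambda p: match_score(p, weights) / p.price).name,
    }
    return ranked, flags


products = [
    Product("Laptop A", 1200, {"performance": 0.9, "design": 0.5, "reliability": 0.7}),
    Product("Laptop B", 800, {"performance": 0.6, "design": 0.8, "reliability": 0.8}),
    Product("Laptop C", 700, {"performance": 0.4, "design": 0.5, "reliability": 0.6}),
]

# Someone who cares mostly about raw performance.
ranked, flags = rank(products, {"performance": 3, "design": 1, "reliability": 1})
print([p.name for p in ranked], flags)
```

Changing an element you disagree with is just a matter of editing a weight or a score and ranking again. With these made-up numbers, shifting the weights toward design and reliability moves Laptop B to the top of the list.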

You would end up with a list of top choices that would be more likely to thrill you. It could also analyze products you already own to flag when performance degraded to a point that would begin to irritate you, or when the extra performance of a new system was great enough to make it worth it for you – specifically based on your needs.

Benchmarks don’t have to suck

When I first ran into benchmarks, Intel was complaining that it built systems that were better rounded, while AMD was using benchmarks to drive people to systems they would like less. Intel tried to get the industry to drop the benchmarks, failed, and now largely optimizes for benchmarks.

If you focus on what people want to do, you’ll provide a better experience, but still likely get slammed by benchmarks. At AMD’s event, the company was pointing to the reasons benchmarks suck.

I think the answer here is to create benchmarks that don’t suck. We have online tools that capture a ton of information about us to sell to advertisers, so it doesn’t seem to be such a stretch to use some of this technology to create a tool that makes us happier consumers. Considering all this information is compiled about us and should belong to us, it would be really nice if it were used to make us happier, rather than just milk us for money. This would be a way to do that. What do you think?

[Image credit: kk-artworks/Shutterstock]

Rob Enderle
Former Digital Trends Contributor
Rob is President and Principal Analyst of the Enderle Group, a forward-looking emerging technology advisory firm. Before…