Mandatory disclaimer - all views in this article are my own and in no way represent views of my employer or my coworkers.
Over the last few weeks I noticed several posts about antiviruses, False Positives and how bad the situation is. For example, this essay from atom0s and this complaint (reg required) by mudlord. And then there is this epic rage by evlncrn8. 🙂
To understand why antiviruses work this way, you need to consider plenty of factors. So, let's take a quick look.
Why make antiviruses?
It usually starts with a group of skilled guys wanting to save the world. They make a great product, people like it, the company makes some money, more people like the product, the company grows even more and so on.
But as the company grows, priorities change. The bigger and more popular the company gets, the more managers and investors it attracts. Those guys usually have no clue about the technology behind an antivirus. And they don't care about the technology; they only see numbers and dollar signs everywhere.
And then the primary goal of the company changes to making profit for shareholders.
What's with the UI?
Let's face it - readers of my blog are not the usual antivirus users. Antiviruses are used by everyone - from extremely skilled IT geeks to Granma Millie living in the retirement home. And this causes the second biggest problem - big companies cannot make a product just for skilled IT geeks, as nobody else would be able to use it. You can't make a product for the average user either. You need to make something that even Granma Millie can use.
And that's why most software products in recent years get dumbed down - managers think that they need to do "inclusive designs" so that even the least technical users can use the product.
New shiny features.
One of the most common complaints I hear is that all antivirus products are becoming huge bloatware. There are several reasons for that. First, product managers just don't know any better. They look at all competitors - if Company A has feature X, you need to have feature X, no matter if it actually makes sense or not. The second reason is that the company somehow needs to sell the new version of the product. You can't say - this version is the same as the old one, we just changed colours and moved buttons around. No, you need to have something like "New version, now with features Z and Q!"
It's not the best way but it's certainly the easiest!
AV reviews and tests.
When you are purchasing a new car, you probably search for reviews online. You probably do the same when you decide to move to a new city, plan your vacation or make any other big decision. That's just normal.
And it's the same with antiviruses - most people will either get a recommendation from someone they trust, or they'll search for reviews online. So, the companies need to invest a lot in PR and make sure their product looks good in tests and reviews.
Most of the time, testing methodologies are not representative of any real-life experience of ordinary users. Testers take whatever pieces of malware they can find and test AV products against them. They don't take into account malware type, sample prevalence or geographical distribution.
I'm sure you feel much safer knowing that your antivirus protects you against a worm that is distributed only through Chinese QQ messenger, or that very nasty banker attacking only Brazilian banks. Don't you?
To test the False Positive rate, testers check a number of files from popular download sites like CNET, Softpedia or PCWorld, or files collected from European SMB companies. Of course, AV companies do the same thing and try to make sure they have no false positives on those sites. But if you're a small software dev and distribute your software using other means, or don't target SMB companies - well, bad luck. A False Positive on your file doesn't influence test results. 🙂
It's a load of crap - but every company is still doing it because lots of potential users rely on such "tests" before buying an antivirus. Some companies even cheat in tests.
Automation and big data.
The amount of new malware and other crap these days is increasing exponentially. According to McAfee Quarterly Threat reports, ~4 million new malware samples appeared in Q1 2009, ~7 million in Q1 2012, ~32 million in Q1 2014 and ~48 million in Q1 2015.
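To put the Q1 2015 figure in perspective, here is a quick back-of-the-envelope calculation in Python. The sample count comes from the report quoted above; the per-analyst throughput is purely my own hypothetical number, and the sketch assumes analysts work every single day of the quarter.

```python
import math

# Figure quoted above: ~48 million new samples in Q1 2015.
SAMPLES_Q1_2015 = 48_000_000
DAYS_IN_Q1_2015 = 31 + 28 + 31  # January, February, March 2015

# Hypothetical assumption (not from any report): one researcher
# can manually analyze this many samples per day.
SAMPLES_PER_ANALYST_PER_DAY = 10


def samples_per_day(total: int, days: int) -> float:
    """Average number of new samples arriving per day."""
    return total / days


def samples_per_second(total: int, days: int) -> float:
    """Average number of new samples arriving per second."""
    return total / (days * 24 * 60 * 60)


def analysts_needed(total: int, days: int, per_analyst: int) -> int:
    """Analysts required to look at every sample by hand."""
    return math.ceil(samples_per_day(total, days) / per_analyst)


if __name__ == "__main__":
    print(f"{samples_per_day(SAMPLES_Q1_2015, DAYS_IN_Q1_2015):,.0f} samples per day")
    print(f"{samples_per_second(SAMPLES_Q1_2015, DAYS_IN_Q1_2015):.2f} samples per second")
    print(
        f"{analysts_needed(SAMPLES_Q1_2015, DAYS_IN_Q1_2015, SAMPLES_PER_ANALYST_PER_DAY):,}"
        " analysts needed for fully manual analysis"
    )
```

That is roughly half a million samples per day, or about six new samples every second, around the clock.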
Think about it. How can you process 48'000'000 samples?
The answer is simple - automation, automation and more automation. Malware classification is a hugely automated process. Does the file look weird? Does it do weird things? Was it sent out in a spammy email? Is it encrypted to prevent automated analysis? Was it protected using stolen Themida? Do other antiviruses think it's bad? Game over, classified as bad!
Sure, sometimes some legitimate software gets classified as bad. At this scale, it's bound to happen.
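Here is a minimal sketch of that kind of rule-based verdict, just to show how easily a false positive falls out of it. This is not how any particular vendor does it; the `Signals` fields, the `classify` function and the "any single signal is enough" rule are all my own simplifications of the questions listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Signals:
    """Hypothetical yes/no answers produced by automated analysis."""
    looks_weird: bool = False
    does_weird_things: bool = False
    sent_in_spam: bool = False
    encrypted_against_analysis: bool = False
    stolen_protector_license: bool = False
    flagged_by_other_vendors: bool = False


def triggered(signals: Signals) -> list[str]:
    """Names of all signals that fired for this file."""
    return [f.name for f in fields(signals) if getattr(signals, f.name)]


def classify(signals: Signals) -> str:
    """Simplified rule: any single signal means 'bad'.

    Files with no signals stay 'unclassified' and would go to
    a human researcher instead.
    """
    return "bad" if triggered(signals) else "unclassified"


if __name__ == "__main__":
    # A legitimate tool whose author used a protector to stop crackers.
    indie_tool = Signals(encrypted_against_analysis=True)
    print(classify(indie_tool), triggered(indie_tool))

    # A file that trips no rules at all.
    print(classify(Signals()))
```

The legitimate tool gets the same verdict as real malware, because the rule only sees "encrypted to prevent analysis" and knows nothing about who wrote the file or why.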
If automation is not able to classify a file, malware researchers will need to analyze it manually. This is where big data software, statistical models and cluster analysis come in. They alert researchers to traffic anomalies, thousands of suspiciously similar files and other "interesting" stuff. Files get prioritized based on prevalence, number of users affected and other factors. And, of course, the bigger the issue, the faster it gets attention from a real human being.
So, if your legitimate software is classified as bad and it affects all 50 of your users - it's not because the AV company hates you or your product. Really, they don't hate you. They just don't know you even exist. So, the sooner you let the AV company know about the problem, the sooner they will fix the issue.
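The prioritization can be pictured as a simple priority queue. The sketch below is only an illustration under my own assumptions: the `Sample` fields, the `CLUSTER_BOOST` weight and the scoring formula are hypothetical and stand in for whatever statistical models a real lab uses.

```python
import heapq
from typing import NamedTuple

# Hypothetical weight: files that belong to a suspicious cluster
# or traffic anomaly get their score multiplied by this factor.
CLUSTER_BOOST = 10


class Sample(NamedTuple):
    name: str
    affected_users: int
    in_anomaly_cluster: bool


def priority(sample: Sample) -> int:
    """Higher score means a human looks at the file sooner."""
    score = sample.affected_users
    if sample.in_anomaly_cluster:
        score *= CLUSTER_BOOST
    return score


def triage_order(samples: list[Sample]) -> list[str]:
    """Return sample names in the order a researcher would see them."""
    # heapq is a min-heap, so the score is negated; the index breaks ties
    # and keeps submission order for samples with equal scores.
    heap = [(-priority(s), i, s) for i, s in enumerate(samples)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, sample = heapq.heappop(heap)
        order.append(sample.name)
    return order


if __name__ == "__main__":
    queue = [
        Sample("small_dev_tool.exe", 50, False),
        Sample("mass_spam_dropper.exe", 200_000, True),
        Sample("regional_banker.exe", 3_000, False),
    ]
    for name in triage_order(queue):
        print(name)
```

With a scoring like this, a file seen by only a handful of users always ends up at the back of the queue unless something else raises its score.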
But hiding your head in the sand and saying "I don't have to time to play a cat and mouse game with anti-virus companies" will get you nowhere.
Are we all doomed?
Think about the points I just made. Your product needs to bring the company money. You need to make a product Granma Millie can use. Your product needs to behave well in tests. Given these requirements, no matter how skilled the developers and researchers are, the end product will be...
Well, it will be just like the product you're getting now - a dumbed-down, feature-bloated, money-making piece of software that fares reasonably well in artificial tests.
You're living in the era of globalization and money-making corporations. Deal with it.