
What Statistics Should I Believe?


There is a wide variation in statistics for visits to our IP Community.

According to Webalizer, provided with my HostGator account, I had 151,249 visits in June, about 5,041 per day.

The "Online Users" page consistently shows around 30 or so visitors every 30 minutes. Which can't be right according to webalizer.

Then, according to Alexa (supposedly verified, since I joined and added their code to all pages), I had only 11,973 visits in June, not 151,249.

Which should I believe?

As a side note, my php.ini is set so that a session lasts 60 minutes, but the "Online Users" page in Invision Power only seems to go up to a maximum of 30 minutes. Isn't that supposed to follow the session setting in php.ini?


The tools based on server log files deliver the most accurate data, since they are based on what the server actually delivers.  External services embedded in the site (and run on the client-side) are not very reliable, since there are many scenarios where the tracking just won’t work. 
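
To make that concrete, here is a rough Python sketch of the kind of counting a log-based tool does. The file name ("access.log") and the combined log format are just assumptions for illustration:

    # Tally raw requests per day from a combined-format access log.
    # "access.log" and the log format are assumptions for illustration only.
    from collections import Counter

    requests_per_day = Counter()
    with open("access.log") as log:
        for line in log:
            try:
                # e.g. ... [10/Jun/2017:13:55:36 +0300] "GET / HTTP/1.1" 200 ...
                day = line.split("[", 1)[1].split(":", 1)[0]
            except IndexError:
                continue  # skip malformed lines
            requests_per_day[day] += 1

    # printed in lexicographic order, which is good enough for a sketch
    for day, hits in sorted(requests_per_day.items()):
        print(day, hits)

Every request the server answered gets counted, whether it came from a person, a bot, or a broken client, which is a big part of why log-based numbers run so much higher than tag-based ones.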


  • 10 months later...

Thanks for taking the time to reply. I'm posting again because the problem is getting even worse.

And sorry to beat a dead horse, but this is an existential question for me. I have to attract advertisers with statistics, and recently, for example, a potential advertiser said he had no interest in advertising on my site because of the Alexa and Google Analytics stats (which he asked for and I shared). With numbers like those, I can definitely understand why!

If Alexa and Google Analytics are correct, or even somewhere close, I should just go out of business.

If Webalizer is more accurate, then I should keep going. But I need to make a decision about what to do.

And I need some way of explaining how I am getting far more traffic than Alexa and Google say I am. Since I recently changed to a secure server, the range in the statistics has become far greater.

Alexa (visits): 7,591
Google Analytics (sessions): 7,523
Webalizer (visits): 224,883

Question 1: Is it even possible I am getting 217,000 separate bot visits every 28 days? That would be almost thirty times the number of real visits! :o

Question 2: Is there ANY way I can measure how many actual visits I am getting per day,  per month, etc.?

Question 3: I recently changed over to a secure server (https://). I realize that is like having a completely new domain, and Webalizer is counting visits on both the insecure and the secure domains:

http://www.turkeycentral.com (visits): 49,759
https://www.turkeycentral.com (visits): 175,124

Would the correct way to measure total visits be to add the two numbers together?
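
As a quick sanity check on that last question: 49,759 + 175,124 = 224,883, which is exactly the monthly Webalizer total quoted above, so adding the two figures does reproduce the combined number.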

 


You can read about the Webalizer terms and how they are calculated here: http://www.webalizer.org/webalizer_help.html

But as I said before: for things like page views, a server-based measurement will of course give the best results. External JS services can easily be blocked, for example.

Alexa's numbers cannot be trusted at all. They are pure estimates based on a very small sample size. They are good for checking trends and for comparison with your competition, but not for giving absolute traffic figures.
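
For what it's worth, a log-based "visit" is typically derived along these lines: requests from the same address are grouped together, and a gap longer than the visit timeout (Webalizer's default is 30 minutes) starts a new visit. The sketch below is a simplified Python illustration of that idea, not Webalizer's actual algorithm, and the log path and format are assumptions:

    # Rough idea of a log-based "visit" count: a new visit starts whenever more
    # than 30 minutes pass between requests from the same IP. Illustration only.
    from datetime import datetime, timedelta

    TIMEOUT = timedelta(minutes=30)
    last_seen = {}   # ip -> timestamp of that ip's previous request
    visits = 0

    with open("access.log") as log:      # hypothetical log file
        for line in log:
            try:
                ip = line.split()[0]
                stamp = line.split("[", 1)[1].split("]", 1)[0]   # 10/Jun/2017:13:55:36 +0300
                when = datetime.strptime(stamp.split()[0], "%d/%b/%Y:%H:%M:%S")
            except (IndexError, ValueError):
                continue  # skip lines that don't look like log entries
            if ip not in last_seen or when - last_seen[ip] > TIMEOUT:
                visits += 1              # first request, or gap too long: new visit
            last_seen[ip] = when

    print("visits:", visits)

A Google Analytics "session" is built from a completely different signal (the JS tag firing in a real browser), so the two figures were never going to line up exactly.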


Alexa provides a "premium" analytics service which works similarly to Google Analytics. But otherwise, yeah, their numbers are complete wild guesstimates.

The number of users who actually block JavaScript is exceedingly small, so that's really not much of a concern.

Google Analytics can provide so much more information than raw log parsers can. Webalizer is pretty archaic in my opinion, and it really only reports raw page hits and the like. It doesn't filter out bots, crawlers, and so on, and it doesn't provide reliable information on repeat visitors, performance statistics, or user engagement, all of which require client-side logic to track.

The reason you see so many visits in Webalizer is that it doesn't account for any of these things: it likely includes hits from crawlers and bots, it can't accurately recognize repeat visitors, and so on.

I'm not saying it's a bad tool, but its numbers are really a bit dubious. The vast majority of your users aren't blocking JavaScript; in reality, probably less than 1% of the internet does. It's an extreme minority.
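
This is easy enough to gauge from your own logs. The sketch below splits requests into "bot-like" and "other" by looking for common crawler strings in the user-agent field of a combined-format log; the keyword list and file name are illustrative assumptions only, and real bot detection is far more involved:

    # Split log lines into bot-like vs other by user-agent keywords.
    # Keyword list and "access.log" are illustrative assumptions only.
    BOT_HINTS = ("bot", "spider", "crawl", "slurp", "archiver")

    bot_hits = other_hits = 0
    with open("access.log") as log:
        for line in log:
            parts = line.split('"')
            # In the combined format the user agent is the last quoted field.
            agent = parts[5].lower() if len(parts) > 5 else ""
            if any(hint in agent for hint in BOT_HINTS):
                bot_hits += 1
            else:
                other_hits += 1

    print("bot-like:", bot_hits, "other:", other_hits)

If the bot-like share turns out to be anywhere near the gap between Webalizer and Google Analytics, that is most of the explanation right there.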


2 minutes ago, Makoto said:

It's an extreme minority.

Depends on where you are. Here in Europe people are very cautious about their privacy, and blocking external services by default (not necessarily JavaScript entirely) is unfortunately quite common.


¯\_(ツ)_/¯

When I still used Piwik, it had a fallback method for tracking users with Javascript disabled, and it was like ~0.7% of my users last time I checked. I don't have access to the data anymore, unfortunately, so I can't cite the exact statistics.


Google Analytics is on par for most audiences. If you want to rule out the JavaScript concern, I also use Clicky, which has a noscript fallback that addresses it. However, I find Clicky and Google Analytics report about the same numbers.


The challenge with any analytics package is that each has limits and flaws that cause it to miscalculate traffic.

Object-based analytics programs such as Alexa and Google Analytics count on a JavaScript snippet being delivered to measure traffic. However, a number of devices and systems don't execute JS. For example, any traffic generated by a search engine would not be caught with an object-based analytics program, because search engines never download and execute CSS and JS.

Server-based analytics don't necessarily calculate page views correctly because of techniques such as AJAX. Webalizer, for example, has been around for nearly 20 years and is based on the same general framework it had at its initial release.

It looks like Alexa and Google are pretty close to agreement on your traffic.  This means one of two things is happening:

1) Webalizer is wrong or misconfigured. (I don't think this is necessarily the case if it was set up by the host itself.) Also, make sure you're looking at the right metric: hits vs. page views vs. unique visitors, etc. They're all very different things!

2) You have a lot of bots crawling your sites. Remember, bots (especially search bots) don't process JS, so Google Analytics and Alexa would never see them.


5 hours ago, Randy Calvert said:

Object-based analytics programs such as Alexa and Google Analytics count on a JavaScript snippet being delivered to measure traffic.

 

Really, though: 99% of the internet has JavaScript enabled. The only people who consciously disable JavaScript are the ultra-paranoid; normal people do not browse the internet with it disabled. In fact, most modern websites completely break without JavaScript. I don't think this is a market segment you really need to care about.

5 hours ago, Randy Calvert said:

For example, any traffic generated by a search engine would not be caught with an object-based analytics program, because search engines never download and execute CSS and JS.

 

Right, and this is the kind of traffic you do not want to be tracking. It's not real traffic; it's automated/fake traffic that you should not acknowledge. Google Analytics intentionally disregards traffic from web crawlers and spiders because, again, it's not real traffic. Just because a website is heavily crawled does not make it popular, nor does it mean you have a lot of visitors, so it's not a statistic you should care about at all.


1 minute ago, Makoto said:

Really, though: 99% of the internet has JavaScript enabled. The only people who consciously disable JavaScript are the ultra-paranoid; normal people do not browse the internet with it disabled. In fact, most modern websites completely break without JavaScript. I don't think this is a market segment you really need to care about.

Right, and this is the kind of traffic you do not want to be tracking. It's not real traffic; it's automated/fake traffic that you should not acknowledge. Google Analytics intentionally disregards traffic from web crawlers and spiders because, again, it's not real traffic. Just because a website is heavily crawled does not make it popular, nor does it mean you have a lot of visitors, so it's not a statistic you should care about at all.

I would not say you don't want to track it or be aware of it at all. For example, if your host is saying you are using too many server resources and 90% of your traffic is coming from bots, that bot traffic is still consuming your resources and is the biggest contributor to the problem. (I recently helped someone in exactly that situation: 7 out of 10 requests did not come from a human, and the person was being told by their host that they were abusing resources.)

If you're tracking it from a perspective of eyeballs, or monetizing the forum...  you're absolutely correct.  Bot traffic does not matter.  It's not going to help pay the bills.  


That's not what tools like Google Analytics are for, though. They're for tracking real traffic and real activity.

Google Webmaster Tools provides valuable insight into how heavily Google's crawlers are crawling and indexing your site, and it even lets you fine-tune the crawl rate right from its own interface, so you don't need server-side log-parsing tools to narrow that down if this is a concern.

There are also better ways of tracking down abusive traffic from bots and web crawlers, such as various command-line utilities; studying aggregate statistics after the fact likely won't help much in that regard.

So if abusive bots and web crawlers are your concern, I really think an analytics package is the wrong tool for the job here.


4 hours ago, Makoto said:

Really, though: 99% of the internet has JavaScript enabled. The only people who consciously disable JavaScript are the ultra-paranoid; normal people do not browse the internet with it disabled. In fact, most modern websites completely break without JavaScript. I don't think this is a market segment you really need to care about.

Yeah, most people don't disable JS, but an increasing number use ad blockers; if you use uBlock Origin, for example, Google Analytics is blocked by default when browsing. It's just something to be aware of. It also blocks Piwik, but it's possible to tweak the Piwik install to get around that.


Wow. Thanks to all for the replies.

On 5/20/2017 at 8:08 PM, ZeroHour said:

Just to say, a lot of ad blockers have now started blocking Google Analytics, FYI.

A couple of years ago I noticed a sharp drop in my Google AdSense income, which I am assuming is because of ad blockers (most of my traffic comes from Europe and the UK). My Alexa numbers and ranking also dropped through the floor for some reason, all while visits were increasing steadily according to Webalizer.

Can anyone recommend a server-side statistics program which can accurately measure real visits and weed out bots? Might that be the best way to measure traffic if the Google Analytics and Alexa stats are faulty?


Piwik with tweaks to get around ad blockers really.

Alexa gets its data by buying it from ISPs, which doesn't happen much anymore, and from the Alexa toolbar, which hardly anyone even remembers from the Internet Explorer days, so it's pretty worthless now.


"System" is basically anything other than the core applications (forum, gallery, CMS, etc.).

I saw something similar on my sites, and when I dug into the logs, it was search engines crawling the member list and profiles. Remember, every member has a profile, and as bots crawl them, that can generate a lot of page views if you have a lot of members.
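
If you want to see what removing that rule changed, Python's standard urllib.robotparser shows how a well-behaved crawler interprets a given robots.txt entry. The rule and URLs below are hypothetical examples:

    # How a well-behaved crawler reads a Disallow rule for /members/.
    # The rule and URLs are hypothetical examples for illustration.
    from urllib.robotparser import RobotFileParser

    rules = RobotFileParser()
    rules.parse([
        "User-agent: *",
        "Disallow: /members/",   # blocks the member list and every profile page
    ])

    for path in ("/members/", "/members/some-user/", "/forum/"):
        print(path, "allowed" if rules.can_fetch("Googlebot", path) else "blocked")

With the Disallow line removed, every profile URL becomes fair game for crawlers, which would line up with the jump in logged visits.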


I see. Perhaps that's why my visit numbers have gone through the roof. In my robots.txt, I removed the /members/ block for bots.

However, doesn't that still go by sessions, so that one bot visit within an hour counts as a single visit?

Thanks again to all who have replied to help me figure out what's going on. :)

 
