Most businesses and organizations track website visitors, and they often use Google Analytics or a number of other third-party tools and apps to do this. But did you know you can track and analyze activity on your own terms, even for visits that happened long ago? How can this info help you leverage your website?

How Servers Track Website Visitors

Websites live on servers, and servers keep logs of visitor activity for the purposes of maintenance and troubleshooting. In fact, most websites are “roomies” on a single server that could contain hundreds of sites. This means servers can fill up fast, and thus platforms vary in how long server logs are kept, but if your site is on a shared or private server, you could already have a treasure trove of info.

Server logs can get large, so hosting companies have their own policies to keep things tidy. It’s worthwhile to ask their tech support what their log retention policy is, and if they’d be willing to extend it for you. Ask them where the logs are kept, and how to reach them. Then remind yourself to go get them on a regular basis, even if you are not yet ready to use them, so you have the long-term data you’ll need to spot patterns and trends.

What Am I Looking At?

“Access” or “visitor” logs are just ordinary text files. Servers typically start a new file on a schedule (often one per month, though this depends on how the server is set up), and that file includes all of the activity, no matter what the source. In fact, you can open them in a spreadsheet like Excel without any special magic. Since each line in the log represents a single request for a page or for a file used on that page, a single page visit could generate dozens of log entries. These are server-level records, anonymous, and exist regardless of the site or web app’s programming. Cookie information and things filled out in forms are not in these logs.

If you are on a platform, you will probably not be able to directly download and view the raw log files. Instead, your platform may “digest” this information into a dashboard or tool to help you see patterns in visitor activity: which pages got visits, when your site was the busiest, which pages were the most popular, etc. But if you can get the raw logs, you can make actionable finds beyond what their dashboard provides.

How to Leverage Your Server Logs

“So what do I do with all of this?” you ask. Before we get into this, let’s discuss what exactly is in the log. Whenever you visit a page, the server needs to fetch every image and file used on the page. It might be the stylesheets for making the site look nice, a bunch of thumbnails used for your online catalog, or JavaScript files (myfile.js, for example) that give your site special effects or cool features. Of course, you also see entries for the page itself.

No matter where your site lives, the logs contain the same info–they just might look different depending on the server. For the purpose of visitor tracking, you’ll want to pay attention to:

  • The visitor’s IP address. Think of this as a “caller ID”. It may change over time if the same visitor comes back months later (internet carriers change this info occasionally for privacy), but this can point you to a physical neighborhood that you might be able to send direct mail to. With some separate lookup tools, you might be able to gather some demographic information (zip codes in relation to publicly available Census info), or at least track the same “ID” as they move from page to page. Third-party tools like Google Analytics give you anonymous “identifiers” in place of this info due to recent changes in privacy laws (advertisers often sell this to other parties, violating the visitor’s privacy). But this is “first party” info: it comes from your own site, and not a separate reporting system.
  • Browser Information: mobile vs desktop, Chrome vs Firefox, and more. This will tell you if your visitor was visiting from a mobile device or not.
  • Datestamp of the entry (either in UTC time or your server’s timezone, not the visitor’s).
  • HTTP code: You could see a variety of 3-digit codes here. Here’s a full list: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status. At a minimum, a 200 means the file was fetched without an issue. A 404 means the image or file wasn’t found, which means your visitor didn’t get the experience you intended.
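
As a rough sketch of how those four pieces can be pulled out of a single entry, the Python below assumes your log uses the common Apache/Nginx “combined” layout. Your server’s layout may differ, and the sample line, the pattern, and the parse_line helper are made up for illustration rather than taken from any particular host.

```python
import re
from datetime import datetime

# Assumed layout: Apache/Nginx "combined" log format. Adjust if your server differs.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Timestamp looks like 10/Oct/2023:13:55:36 +0000 (day/month/year:time offset)
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    # Crude device guess: many phone browsers include "Mobi" in the user agent
    entry["is_mobile"] = "Mobi" in entry["agent"]
    return entry

# Made-up sample entry (203.0.113.7 is a reserved documentation address)
sample = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /images/logo.png HTTP/1.1" 404 512 '
    '"https://example.com/blog/old-post" '
    '"Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) Mobile/15E148"'
)

entry = parse_line(sample)
print(entry["ip"], entry["status"], entry["path"], entry["is_mobile"])
```

In this made-up entry, the 404 tells you the logo image was missing, and the referrer field shows which page asked for it. Lines that don’t fit the assumed layout come back as None, so they can be skipped or inspected by hand.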

I can get deeper into the nuances of each fragment, but since this is an overview, let’s focus on the value of these logs for a site owner.

Panning for Gold

  • Missing files: Many CMS systems do not tell you if an image from your library is being used, so if someone deletes it, you may have a blog post from a year ago that is now missing that info. If your site refers to images, documents, or posts on other sites, they may no longer be around and you’d never know it. Turbine has done audits of logs to let clients know which pages and posts reference a particular image, document, or internal link. This helps ensure your site keeps looking professional with no broken or missing parts.
  • Locality: When we access the internet from home, our internet provider (our “ISP”) provides us with an IP address. Our mobile phone providers do this as well, though they give us a new one every time we start an online session. This IP address associates you with a neighborhood, but does not map to your specific address or phone number. It’s called a “dynamic” IP address because it changes once in a while in accordance with the privacy rules of the ISP or carrier (hence the value of cookies, which survive these changes). On the other hand, if you are at work, your business probably has a “static IP” when you use your computer or their wifi, and it can be tracked to a specific address (but not the individual visitor). If you’ve sent out an email campaign to hundreds of offices that includes a link to a specific landing page, you can not only count how many people clicked on the link, but also see where they were when they did so. If you’re a business that relies on local marketing, this is quite helpful. You may have subscriptions that charge you hundreds per month for this information.
  • Volume shifts: Datestamps are always helpful for establishing a timeline–like seeing when your site is getting the most traffic. Take it with a grain of salt, but the timespan between a single IP address moving from page to page tells you how long they spent on the page–or how long it got left open by the visitor.
  • Return visitors: the average dashboard may tell you how many visitors returned to a particular page, but if you understand what the ratio is between new and returning visitors for a given page, you’ll have a better sense of how valuable your content is. A returning visitor is one who is getting closer to contacting you about your offerings.
  • Breadcrumbs: knowing how many visitors saw a page is helpful, but where did they come from, and where did they go afterwards? Understanding the visitor’s “journey” can help you cater to them better with more accurate automated content suggestions than just showing them articles in the same category. Logs are anonymous, but you may still be able to discern commonalities among the content the visitor saw by looking for patterns from the user in relation to the IP addresses and timestamps of each entry, which would help you know your visitors better.
  • Query strings: These are pieces of data that can be tacked onto a page link, and are often used to send someone to a specific product or landing page, pass along product, referral and tracking info, and more. Example: mypage.com/shopping?ProductId=464abc&campaign=summersale If you have an online catalog, your system probably gives each item an ID, and it’s likely in the query strings. If the link has a question mark, the query string is everything after it, and each name-value pair is separated by an ampersand (&). In the example, we see a ProductId and a campaign. Your system will see these values and show the visitor the product and the proper price. As you can see, this is a helpful way to see which products got a closer look, even if the visitor didn’t put anything in the cart and even if your cart is a widget from a third-party platform (like Shopify). If your software doesn’t notice these moments or encrypts them so they aren’t as legible as our example, you’re missing out on valuable insights, especially in tandem with the breadcrumbs. Of course, Turbine would be glad to help give you that kind of leverage.
  • Improved Visitor Tracking: If you think your dashboard’s visitor count seems a bit high, you may be right. Analytics often includes your own visits, and may even include you logging into your own admin panel. With smarter code in your reports, these can be accounted for.
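
To make the query string and own-visit points a little more concrete, here is a minimal Python sketch that counts product views from requested links while skipping visits from your own address. It assumes you have already pulled the IP address and link out of each log entry, and that your catalog uses a ProductId parameter as in the example above. The OWN_IPS set, the sample entries, and the count_product_views helper are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Hypothetical: addresses to leave out of the counts (your own office, for example)
OWN_IPS = {"198.51.100.10"}

def count_product_views(requests, param="ProductId"):
    """Count views per product ID from (ip, url) pairs, ignoring our own visits."""
    views = Counter()
    for ip, url in requests:
        if ip in OWN_IPS:
            continue  # skip our own traffic
        # Everything after the "?" becomes a dict, e.g. {"ProductId": ["464abc"]}
        query = parse_qs(urlsplit(url).query)
        for product_id in query.get(param, []):
            views[product_id] += 1
    return views

# Made-up entries using reserved documentation addresses
requests = [
    ("203.0.113.7", "/shopping?ProductId=464abc&campaign=summersale"),
    ("203.0.113.9", "/shopping?ProductId=464abc"),
    ("198.51.100.10", "/shopping?ProductId=464abc"),  # our own visit, skipped
    ("203.0.113.7", "/shopping?ProductId=991xyz&campaign=summersale"),
    ("203.0.113.7", "/about"),  # no query string, nothing to count
]

views = count_product_views(requests)
print(views.most_common())  # [('464abc', 2), ('991xyz', 1)]
```

Passing param="campaign" instead would tally visits per campaign. A real report would likely also need to filter out bots and your admin pages, which this sketch does not attempt.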

This is not an exhaustive list, but as you can see, your visitor logs are a great resource that won’t cost you anything to obtain. If your current platform or tools don’t give you a way to leverage them, or you’d like to kick around some ideas about how your business could leverage your logs, please feel free to contact us.