EJM Designs Limited Blog

Thursday, May 28, 2009

DIY: XML Sitemap Submission and Validation

Tuesday I went through the process of creating an XML sitemap and demystifying autodiscovery. Today, as promised, we will delve into getting those sitemaps into the figurative hands of the search engines.

"Hey," you might say, "but if I've created my sitemap and bolstered it with autodiscovery, can't Google find it on their own?" Of course. But since no one except Google knows their own schedule, would you prefer to rely on a blind run or to say "Hey Google! Over here!"?

And while I am talking about Google in generalized terms, we will be covering validation and submission with the Big 4: Google, Yahoo, Microsoft Live, and Ask.

The Setup

The basic setup for most of these is the same: you will validate that you have control over the site and submit a "feed" or sitemap. The validation process is either adding a validation meta tag to your home page or dropping a validation file in your root directory. Whether a tag or file, it's just an alphanumeric validation string so Google et al. can confirm that you own or have direct access to the site. The second part is telling them where your sitemap is.


Google

You will need a Google Account for this one. Once you have that, go to http://www.google.com/webmasters/tools. Add a site. Validate your site by adding a meta tag to your home page or uploading a file to your root directory. Post the link to your sitemap. You're done!


Yahoo

You will need a Yahoo login for this one as well. Go to https://siteexplorer.search.yahoo.com/mysites. Just like Google, you're going to validate with a file or meta tag and post your XML sitemap "feed."

Microsoft Live

Again, you'll need a Microsoft login, Hotmail or the like, to get into this site. Go to http://webmaster.live.com/. Rinse and repeat: upload a file to your site for validation and list your sitemap. Blammo!

No word yet on how bing.com will handle the issue.


Ask

This one's easy. Simply put this string into your browser:

http://submissions.ask.com/ping?sitemap=http%3A//www.the URL of your sitemap here.xml

And that will ping the site to tell it your sitemap is ready for viewing and crawling.
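If you'd rather not hand-encode that address, here is a minimal sketch that builds the ping URL for you. The function name buildAskPingUrl and the yoursite.com address are my own placeholders, and the endpoint is simply the one quoted above, so treat it as an illustration rather than a guaranteed-working call.

```javascript
// Build the Ask ping URL for a given sitemap address.
// The endpoint is the one quoted in this post; nothing here checks that it is still live.
function buildAskPingUrl(sitemapUrl) {
  // encodeURIComponent escapes ":" and "/" so the whole sitemap address
  // travels as a single query-string value.
  return "http://submissions.ask.com/ping?sitemap=" + encodeURIComponent(sitemapUrl);
}

// Hypothetical sitemap address, for illustration only.
console.log(buildAskPingUrl("http://www.yoursite.com/sitemap.xml"));
```

This version escapes the slashes as well as the colon, which is a stricter encoding than the hand-typed string above.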

Is sitemap submission a necessity? No, it is not; eventually the search engines will pick up on your sitemap or autodiscovery and get to crawling. But if you're a "cover all bases" and "ASAP" person like I am, go ahead and spend the half hour it takes to knock out all four of these and go to sleep a little easier.

I'm happy to take any questions or comments in, well, the comments area.

Tuesday, May 26, 2009

DIY: XML Sitemap and Autodiscovery

Traditionally, a sitemap has been an HTML page that lays out your website's page hierarchy. Simple enough: use the unordered list tagset and categorize those pages. And for more SEO love, put a link to that page in your footer. Tasty!

But then it got a little more complicated, with both sitemaps and SEO. But - don't worry - not that much.

You may have heard the term "XML sitemap" and perhaps even the term "autodiscovery." And you thought "WTF? How do I do that?" It's really not as hard as you think. And if you have FTP access to your site, you can.

The XML Sitemap

A couple years ago (has it been that long?), Google spearheaded the idea of an XML Sitemap, which is basically an XML file sitting at your root directory called, obviously enough, "sitemap.xml." It was soon adopted by MSN Search (now Microsoft Live!), Yahoo, and Ask. All the parameters and guidelines can be found at sitemaps.org, but it boils down to this: an XML sitemap lists all the pages on your site, the last change date, the frequency of change, and the importance on a 0.0 to 1.0 scale (1.0 highest).
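To make those four pieces of information concrete, here is a rough sketch that turns a short list of pages into sitemap XML. The element names and namespace come from sitemaps.org; the function names (escapeXml, buildSitemap) and the yoursite.com pages are made up for illustration.

```javascript
// Escape the characters that XML treats as markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Turn a list of page records into the text of a sitemap.xml file.
function buildSitemap(pages) {
  const entries = pages.map((page) => [
    "  <url>",
    "    <loc>" + escapeXml(page.loc) + "</loc>",                // the page address
    "    <lastmod>" + page.lastmod + "</lastmod>",               // last change date
    "    <changefreq>" + page.changefreq + "</changefreq>",      // frequency of change
    "    <priority>" + page.priority.toFixed(1) + "</priority>", // importance, up to 1.0
    "  </url>",
  ].join("\n"));

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

// Hypothetical pages, for illustration only.
console.log(buildSitemap([
  { loc: "http://www.yoursite.com/", lastmod: "2009-05-26", changefreq: "weekly", priority: 1.0 },
  { loc: "http://www.yoursite.com/contact.html", lastmod: "2009-04-02", changefreq: "yearly", priority: 0.3 },
]));
```

Two pages are easy enough; a few hundred are where the hand-rolled approach stops being fun.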

Creating a sitemap.xml file from scratch is a daunting prospect, to say the least.

So how about a tool, Eric? Of course. The free software I personally use to create these sitemaps is GSiteCrawler that you can download here.

The program is intuitive: you enter your URL, click some boxes specifying what you'd like logged, and let it run. (NOTE: this tool is for a live site; grabbing from a staging site will only confuse the robots/spiders and potentially be detrimental in a "duplicate content" kind of way.)

Generate the sitemap and save it on your system. Take a look at it in a text editor to make sure everything looks good, and upload it to your website's root directory, where the home page is.


Autodiscovery

Very soon after the sitemap.xml protocol was adopted, so was autodiscovery. It might be a fancy word to flaunt in front of people, but it's really quite basic. In your root directory, where your home page is and where your sitemap.xml now is, there should be a "robots.txt" file. That file should contain information about what directories in your site should not be crawled (disallowed).

Autodiscovery is a single additional line in that file that reads like this:

sitemap: http://www.yoursite.com/sitemap.xml

That's it!

So why's it called "autodiscovery?" Simple enough: when the robots/spiders visit your site, the first thing they are "supposed to" check is the "robots.txt" file so they know what they don't have to bother with, saving them time. This additional line increases their efficiency by immediately directing them to the exact file they can use to guide them through your site.
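For the curious, here is a simplified sketch of what a spider might do with that line once it has your robots.txt in hand. This is my own illustration, not any search engine's actual code; the function name findSitemaps and the sample file contents are hypothetical.

```javascript
// Pull every sitemap address out of the text of a robots.txt file.
function findSitemaps(robotsTxt) {
  const sitemaps = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    // Drop anything after a "#" comment marker, then trim whitespace.
    const line = rawLine.split("#")[0].trim();
    // Match the field name case-insensitively, so "sitemap:" and "Sitemap:" both count.
    const match = /^sitemap\s*:\s*(\S+)/i.exec(line);
    if (match) sitemaps.push(match[1]);
  }
  return sitemaps;
}

// Hypothetical robots.txt contents, for illustration only.
const sampleRobotsTxt = [
  "User-agent: *",
  "Disallow: /cgi-bin/",
  "sitemap: http://www.yoursite.com/sitemap.xml",
].join("\n");

console.log(findSitemaps(sampleRobotsTxt));
```

Running the same function over your own robots.txt is a quick way to confirm the line is spelled and placed the way you intended.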

Fancy words, easy results.

"But Eric, I've got my XML sitemap and my autodiscovery in place, so how do I make sure Google, Yahoo, etc. know that it's all cool and they need to take a look at my site again?"

Well, sir/madam, that would be tomorrow's post: DIY: How to Let the Engines Know You're There.

Tune on in.

Questions or suggestions are always welcome in the comments.

Friday, May 22, 2009

Cleaning Out Your Twitter: The No Return Love Split

You've done your due diligence and searching and following and you feel like you need to do a little cleanup of your Twitter account. And by cleanup, I mean removing followers.

Wha-what? Why would anyone want to remove followers? I'll go into that in another post, but suffice it to say it's like a breakup: they're hounding you, they've changed, they're not what you thought, or they're ignoring you. It's time for a clean split.

Let's take that last one. You are being ignored. By that, I mean you have followed someone and they are simply not following you back. It's been weeks or months even and part of the Twitter experience is giving love back. (note: if you're following celebrities and they are adding value to your daily tweet reading, they're - by definition - exempt from this)

So how can you tell if someone's not following you, ignoring your original follow notice? It's quite simple:
  • Log into your Twitter account through the twitter.com website.

  • On your home page, click the "following" link to view people who you are following.

  • Anyone who you cannot send a direct message to is not following you.

  • Clean out that trash.

Oh, what? No. Hold on, now. Calm down. I know. I know. Numbers are important to you and you following many people gives your life meaning. But wouldn't you rather be spending time following people who are following you? Isn't that more valuable? More meaningful?

Hey, don't look at it as a severing of a relationship. They weren't following you, right? Look at it as cleaning out the garage or the fridge, as nature culls a herd.

The definition of winnow is to separate the grain from the chaff. Get rid of that chaff. You'll have a leaner, more powerful Twitter account for it.

Wednesday, May 20, 2009

Chat Widgets for Your Website

Because not everyone has Trillian running on their system with 9 different accounts funneled into it (seriously, I do), I wanted to make the time investment to make sure people could chat with me whenever they wanted to. I have succeeded on the EJM Designs Limited Chat page.

The bonus to providing this service is that while you need an active AIM, Google, or Y!M login, your website visitor does not.


Visit AIM's WIMZI page and sign in next to the orange question mark in the grey area above the options section (or it will not give you the source code). Then fill out your specs and click the "Create it" button to get the source code. Paste into your website.

Mine looks like this:

AIM Chat Widget

Pluses: easy setup, shows I'm connected when using Trillian. No minuses so far. Must stick to my webpage to chat, but with tabs today (I have 10 open now), not a biggie.

Google Talk Widget

Log into your Google Account and head on over to the Google Talk Widget Page. Set your Preferences, update the code, copy, paste into your website, and voila! You get this:

Google Talk Widget

The visitor clicks the link, which opens a separate chat window. Pluses: Easy setup, simple styling, and even though being stuck on the page isn't a detraction, I like the separate window. Flexibility. And ALT-Tab is easier than CTRL-Tab. Go about your business. Chat with me. Minuses: Because Trillian does not gobble up Google Talk, I need to be logged into this account to hear from you on my site. That happens for about 25% of the day.

Yahoo Messenger Pingbox

Go to the Pingbox studio, enter your parameters, copy and paste. Looks like this:

Yahoo Messenger Widget

Pluses: Easy setup. Minuses: FAIL. When you get to the second page of setup, every time you change the dimensions, the object id attribute changes as well, which wouldn't be an issue at all if the characters didn't change to include - 75% of the time - Chinese and other definitively unicode characters that show up as boxes and don't paste as anything other than question marks.

(sidenote: tweaks to the widget size have a lower limit of 180x320 and will repeatedly reset in the interface despite no stated limits, though change in pasted code to 160 width shows up fine.)

And once you tease it enough to hand you basic ASCII, logging in through Trillian or logging in through Yahoo Mail still shows you are not logged in on the widget. Bad widget!

My only guess on this one is that you actually need to be logged in through Yahoo Messenger, though I'm not sure why as Trillian's login to Yahoo always upsets the messenger-through-mail. And because I won't download Yahoo Messenger to test this theory...FAIL!

Why do you think I've got Trillian installed? 9 IM accounts. Duh. Removed from page.

Snark aside, and Yahoo aside, the Google and AIM widgets are lovely and it's only because of the Google login that I must go with AIM FTW!

Let me know your thoughts on whether website widgets are a good idea or a pain. Should we allow anyone to contact us anywhere anytime? Ups and downs?

Palm Pre - Geeking Out for June 6th

I love gadgets of all kinds, especially when it comes to high technology. But I've always been reasonable about it: while early adopters get it first, they also end up with the most headaches and bugs and problems.

I'm sitting with and currently happy with Sprint, so iPhone has never been an option. And then I saw this previewed:

Palm Pre

Tuesday the Wall Street Journal had the official scoop: release on June 6th.

But what's the buzz? Slick look, lean new OS, sick docking system, $200 with contract, and everyone drooling over the upcoming iPhone says it's going to suck.

I do believe that means it's for me. Even if it means getting in line before the store opens.

Have you heard anything about this phone? Should I be considering something else? What is your personal gadget pron? Let me know in comments.

Friday, May 15, 2009

Real Life Twitter

From College Humor:


What Are Your Feelings on a Face?

My current avatar/image thing for all my social media is a lovely, lower-case "e" with a semi-transparent, desaturated version of the main graphic feature on my website. But it's impersonal, and I plan to change that in the next week.

"How do you plan to change it, Eric?" you ask. By doing what most people do for their images: use a picture of my face.

My question today is: What are your feelings about the avatars people use? Does seeing a smiling face imbue that person with an aura of trustworthiness? Does a logo or non-face do the opposite, as though a wall is up? And - finally - what are your thoughts on someone's baby or pet plugged in as their avatar?

I'll post on all these things and more as soon as I get my own mug up there, but wanted to know what you were feeling.

Wednesday, May 13, 2009

Google Chrome Goes to Teevee

The sad thing about keeping 8-15 tabs open at a time is that I sometimes forget the reference from whence they came. So, to that person: thanks.

This is a Japanese Google Chrome Ad. Apparently their first commercial?

The blatant stop-motion format is as 70/80's nostalgic as the Arkanoid theme is 80/90's while introducing an "aughts" browser.

Brilliant or fail?

The Techno-Drudge Myth-Fraud

I'm not purporting to be staging a coup, though if I were I would do it upon a humongous ottoman as it would be more comfortable. I'm not making a political issue over Matt Drudge's site content. Very simply, I have seen what I would call fraud and need to pull aside the curtain for a moment of clarity on what the numbers you may run across actually mean.

And yes, this is important. Because it's news media and because it's technology (or technology hacked). This is a deceptive practice on the main stage.

What's the issue?
Every so often, Matt Drudge's The Drudge Report reports on how much traffic he's received, generally with some tasty braggadocio. The latest report (PDF) puts him just under Google News, CNN, and the Weather Channel.

The first clue that something's rotten in Denmark would be that Drudge's single page (sometimes with a secondary "developing" page) is pulling 2.14% of traffic compared to CNN's many, many pages taking in only 3.15%. (NOTE: These percentages may not appear large, but this is market share across ALL News and Media websites that exist.)

This is relatively preposterous when you look at the number of potential visitable pages: 1 vs. hundreds. Is the single page site really that popular?

In a word: NO.

What's a Visit?
The report that's being touted is from Hitwise, and they state in the first paragraph: "Note - the Hitwise data featured is based on US market share of visits..."

A visit is defined as every time someone accesses the site. Sometimes it is separated into new (or unique) visitors and return visitors. In this case, every time someone accesses www.drudgereport.com, it is counted as a visit.

So what's so hinkey?
If you've ever visited The Drudge Report and stayed for more than 3 minutes, you may have noticed that the page will reload by itself. Every time it does that, it stands to reason that, because you are again accessing the site without clicking on another link on the site, it tags another visit. Leave it open in a tab for a couple hours while you mean to get to it? You get the picture.

(As a comparison, every time you visit CNN.com, if you click 50 different links to read those stories, it still counts as a single visit.)

Why does Drudge's site refresh automatically like that? It's a simple bit of JavaScript that goes like this:

var timer = setInterval("autoRefresh()", 1000 * 60 * 3);
function autoRefresh(){self.location.reload(true);}

This means that if you're carefully reading through the headlines and it takes you 15 minutes, you are registering 5 visits even though you've only visited the site once.
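The arithmetic is simple enough to sketch. This assumes, as I argue above, that every page load gets tallied as its own visit; that is my reading, not a documented Hitwise rule, and the function name pageLoads is just for illustration.

```javascript
// Count page loads during one sitting: the first load at minute 0,
// plus one reload each time the refresh interval elapses before the reader leaves.
function pageLoads(minutesOnPage, refreshMinutes) {
  if (minutesOnPage <= 0) return 0;
  return Math.ceil(minutesOnPage / refreshMinutes);
}

console.log(pageLoads(15, 3));  // 5: loads at minutes 0, 3, 6, 9, and 12
console.log(pageLoads(120, 3)); // 40: a tab left open for two hours
```

So a forgotten tab can rack up dozens of loads from a single reader.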

What else?
The Drudge Report website is a single page with dozens of links, almost all of which link to stories on other sites. None of the links use the target="_blank" anchor attribute, meaning when you click a story link, it takes you to the other site in the same window. But you would like to see what other stories they have on the site, so when you are done reading that story, you click the Back button. You've just logged another visit.

It is impossible to calculate the actual number of people accessing The Drudge Report in any given month without more data, like unique visitors. But the data that is being pushed as proof of popularity is deviously misleading, as the pairing of the auto-reload and the necessity of the Back button leads, by the site's very structure, to an enormous number of visits. Caveat emptor.

Again, this is not a political opinion, but information based on web analytics knowledge and source code research. I visit The Drudge Report several times a week. But I know what's going on behind the scenes. I generally take a quick scan, CTRL-click the stories to open them in separate tabs, and close out of the main one.

And I just thought you should know.

Any ideas or thoughts are always welcome in the comments.

Saturday, May 9, 2009

Yahoo Registration Fail

In the ever-expanding world of fine-tuning EJM Designs Limited, I recently switched my AIM and Yahoo IM names to fall more in line with branding. The old one was "estragon420," and while 420 is there because it's my wedding anniversary, I've received more than one comment about it being a marijuana reference. And how professional is that?

AIM is a stinker because it doesn't allow any non-alphanumeric characters, so I couldn't have a straight "eric.ejmdesigns" for all my IM parts.

But the story here was a delicious fail from Yahoo:

Yahoo registration fail

Note to Yahoo's dev crew: mm/dd/yyyy is a very standard format for dates. Unfortunately, "mm" is generally - no, always - a reference to the two digit representation for the month. Not the whole word. Perhaps QA was on vacation that week?

Goofs aside, you can view my new AIM/YIM in the right column. Drop me a line sometime, even just to say Hi.

Thursday, May 7, 2009

10 Reasons I Will Not Follow Back on Twitter

(or Basic Behaviors to Avoid)

EJM Designs Twitter

So you've seen my Twitter and you follow me, which sends me an email alert and gives me a chance to follow you back.

Some people have auto-follows set up so if you follow them, they follow you back automatically. Some even have robo-auto-followback direct messages, notably generic notes usually declaring how your Twitter account rocks. I can't stand that, so I make sure I click every profile, and if the person is relevant, I follow them, sometimes sending a brief but personal thank-you.

Wha-what? You don't follow everyone back?

No. For me, Twitter is about developing online relationships and connecting to people and organizations that provide value. Not every one of the items listed here is - by itself - a deal breaker. But all are serious considerations.

  1. I do not know you and you have protected your updates
    I understand the need for privacy, but you've got to put yourself out there to get a little back. Your actual tweets are a good portion of why I follow.

  2. You have no location or bio information
    I connect with two types of people aside from personal friends: local Cincinnati (general Ohio, NKY too) interest and worldwide for tech, web, and SEO folks. If I do not know where you are or what you do via profile, I'm immediately turned off. And stop with the "Worldwide" or "Earth" for location. It's not quaint; it's trite.

  3. You have no (or very few) updates
    It's very much about the tweets. If you are not tweeting, you are not adding value, and I can tell very little about you. Get at least 10 updates posted before following.

  4. The whole of your profile page (20 tweets) is from today
    Other side of the post coin from absentee. If you're tweeting 50 times a day, you are not adding value, but clutter. Obviously newsfeed accounts and the like are exempt.

  5. The majority of your tweets openly (or secretly) link to your website
    If you tweet a couple times a day and almost every time link to www.yoursite.com, this is shameless self-promotion and you need to stop. You can still fix it. If you are pushing www.yoursite.com through tinyurl.com or the like so it looks different every time, it's still shameless self-promotion, and you've forced me to click at least twice to find this out. You are dead to me.

  6. Your Twitter account is foreign language or very poor English
    I haven't even tackled French as a refresher on Rosetta Stone. I am a writer with an English degree. I mean no offense, but if you have poor command of English, don't tweet in English. If your account is in any other language, I simply cannot read it.

  7. The majority of your tweets are @ responses or RTs
    I'm all for having a conversation and giving someone RT props when it's good content, but I'd like to see what value you add. Chances are, based on location or background, we follow some of the same people. Show me what you can do.

  8. You have 40,000 followers
    The drop in the bucket. You probably ruined your account with some auto-follow pyramid scheme and if 40K with no value makes you feel better, that's fine. That's how you drive your Twitterbus. I drive mine around you. Exemptions for some genuinely famous people.

  9. You offer me nothing
    That may sound snooty, so allow me to explain with several examples: you are tweeting about your sandwiches and bowel movements; you are promoting a one-shot contest; you are promoting a product I have no interest in purchasing; you are selling real estate...anywhere; you're tweeting solely about logging or knitting or any one of a hundred things that may be interesting but I am not (currently) interested in.

  10. You are Ashton Kutcher
    Bwahaha. No, really. I won't. That was tired 3 seconds after the billboards.

(And while you're at it, might want to make sure your background isn't cut off.)

It really comes down to what works in most of life: moderation. Don't get carried away on any one angle, mix it up a little bit, and you're building a great account. Reward yourself with a Tweetup. Shake some hands. And smile.

Is this a reasonable list? What are your thoughts? What are your Twitter turn-offs?

Wednesday, May 6, 2009

Blood From a Stone?

New York Post reports today:
Media companies have had a hard time getting consumers to pay for content online, but Rupert Murdoch's News Corp. is attempting to leverage its worldwide news-gathering operation to get them to do just that.

In one of the most ambitious online undertakings by a media outfit, News Corp. has assembled a team of executives to devise a system to charge for content on the Web.

In related news, my dog and two cats have assembled to figure out cold fusion.

(and why is "Web" capitalized?)