Sitebulb Version 1
Post-beta, Sitebulb spent circa 6 months in a 'version 1.xx' incarnation, and these notes chart this period of rapid development and experimentation. The release notes are similarly explorational, marrying a newfound expression through imagery with an eclectic use of language.
Released on 19th March 2018 (hotfix version)
#1 Fixed multiple issues with custom headers
The custom header setting (in Advanced Settings) was not working properly, so we fixed that. Also, even in its broken state it was not persisting when you went to 'Pause & Update Settings', or went to do a re-audit, so you had to enter the data again. And on top of all that, we'd left a typo in there ('customer' instead of 'custom'). FML.
#2 'Stop XML Sitemaps' now actually stops
We discovered a very frustrating bug that meant that if you decided to stop the crawl early and build the reports, Sitebulb could get stuck crawling sitemaps. If you hit 'Stop XML Sitemaps', it would just skip onto the next sitemap in the scheduler, instead of actually finishing the audit and building the reports. This is fine if you only have one sitemap, but if you have 5000 it's...less good.
Released on 7th February 2018
#1 Improved crawling speed
Everyone likes FAST. Fast is awesome. My little boy is constantly debating who is faster: Sonic, The Flash, Usain Bolt or Catboy (from PJ Masks). My money's on Catboy.
We've made Sitebulb's task scheduling more efficient, which has made it crawl faster. You can safely file this one under 'performance enhancements.'
#2 View folder link
Sitebulb writes data to disk, which means there's a directory on your local hard drive which contains all the data and export files. Occasionally, folk come to us asking how to find these directories. And to be honest, trying to explain it is a massive pain in't arse. So we just made a little button on the Audit Overview that magically takes you there.
#1 XML Sitemap Report pulling in Sitemaps from other Audits
A number of users noticed this peculiar issue, where Sitebulb would display the XML Sitemap details for a completely different website, when you looked at the XML Sitemap report.
This was an issue with the user interface, rather than the crawler itself, and has now been rectified.
#2 Truncated URLs and anchor text on URL Details page
We noticed that on some websites, the URLs and/or the anchor text were too long to fit in the space we had allotted for them, causing them to tumble horizontally off the side of the page, rendering the page nigh on useless.
Enter, that versatile and truly magnificent typographical cliffhanger itself... the ellipsis.*
*Used in this instance merely to truncate. A waste if you ask me.
#3 Disable cookies now works properly
Sitebulb has cookies enabled by default, but you can turn them off in the Advanced Settings. Well you can now anyway - if you tried to do it before today you may have noticed that it actually did not turn them off. Whoops.
#4 Tiny typo fixed in audit progress text
Annoyingly, no users reported this glaring error, where the report building copy for 'Exporting Multiple <h1> tags' was showing the HTML:
Fortunately, I spotted it. Lucky someone is paying attention.
#5 HTML also in Compare Audits export
Just like the issue above that no one could be bothered to notice, we also discovered that HTML had ended up in the export from the Compare Audits function.
More shit and bollocks!
#6 Time of audit changed to only display hours:minutes
As sticklers for precision, we've always liked to display the 'time of audit' to the nearest second. Upon hearing industry rumours about one member of the Sitebulb team and his 'obnoxious pedantry', we've made the controversial decision that this degree of accuracy is wasted on, and quite frankly, unappreciated by, the entire Sitebulb community.
#7 Typo on Internal URLs Hint
There was a typo on one of the Internal URLs Hints, where the word 'usual' should have read 'usually'. Disappointingly slack on my part.
#8 Pagination was broken on the Dashboard
Curiously, the pagination links at the bottom of the Dashboard stopped working, so you could not browse through old projects. This was particularly annoying for users such as myself, having 320 Projects in my list.
#9 The Hint: '<head> contains invalid HTML elements' was firing for commented out elements
One of our most popular Hints* was giving false positives when encountering commented out elements. Some schoolboy shit right there.
*This Hint scored in the top 3 Hints of all time, based on a survey of 3 people.
#10 Pause and update settings was forgetting the crawl limit
If you set a crawl limit (e.g. 100,000 URLs) before crawling a site, then mid-crawl decided to pause and update the crawl setting, Sitebulb would very annoyingly forget the crawl limit and just keep crawling beyond the limit.
#11 URL Details view showing URL listed in the same sitemap multiple times
If you clicked through to the URL Details page, and had XML Sitemaps turned on, the same sitemap would be listed multiple times. This led to a number of confused users, asking irritating support questions such as 'what am I doing wrong?' and 'why is this happening to me??' and 'what did I do to deserve this?'
#12 Exporting large audits was not always working
Very few people will have seen this, but if you tried to export very large audits the software would sometimes throw an error.
Released on 9th January 2018 (hotfix version)
#1 Meta description length default changed
Since the beginning of time, every SEO on the planet has been conditioned to write all meta descriptions in the range: 140 characters < x < 160 characters. It has got to the stage where it is impossible to even write a regular sentence that falls outside these strict bounds (check these ones and see for yourself).
But then Google came along and decided to change everything – 'meta descriptions can now go up to 320 characters', they said – which I’m sure you’ll agree makes for preposterously long, awkward, unwieldy sentences that just go on and on and feel completely unnatural to both reader and writer, present company included.
So, Sitebulb has followed suit, changing the default 'too long' setting for meta descriptions to now be > 320 characters, following Danny Sullivan's tweet (Please note: this affects only the default setting, which was previously set at 170 - if you have already overwritten the default, yours will not change to 320).
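For the curious, a length check with an overridable default might look roughly like the sketch below. The 320 figure comes from these notes; the function name, the labels and the 'too short' value of 70 are made up for illustration and are not Sitebulb's actual code.

```python
def classify_meta_description(text, too_short=70, too_long=320):
    """Label a meta description by its character count.

    too_long=320 mirrors the new default described above.
    too_short=70 is an invented value, purely for illustration.
    """
    length = len(text.strip())
    if length == 0:
        return "missing"
    if length < too_short:
        return "too short"
    if length > too_long:
        return "too long"
    return "ok"
```

Passing your own too_long value plays the part of an overridden setting, which is why changing the default leaves it alone.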
#1 Sitebulb now correctly opening on startup
Occasionally, users would end up in a situation where Sitebulb would not always open up properly first time around, meaning they had to go and start the app up for a second time. If this never happened to you, you're in luck, but let's all just agree that it sounds pretty fucking frustrating. We've resolved it by completely rebuilding the startup procedure, for both Windows and Mac.
Released on 1st December 2017
#1 We sold our souls to the Apple
When we first started building Sitebulb, it was our single guiding vision to have one universal user interface that looked exactly the same on both Windows and Mac. This was the one fundamental principle we knew we must stick by. We spent hundreds of hours perfecting the design - everything from the stunning report graphs to the delightful little X button in the top right hand corner.
Windows users positively fawned over Sitebulb's exquisite design.
Mac users, however... well, that was a different story. "We really love it guys", they'd say, "but..." (there's always a but). "I can't get enough of those graphs, but...", "Those crawl maps - magnificent! But..."
"...BUT WHERE ARE MY TRAFFIC LIGHTS?????"
Day in, day out. Relentless. 400 emails a day about it. They'd tweet me, Facebook messenger me, Slack me (is that a verb yet?), they'd send me SMS messages like they were actually my friends. One guy hand delivered a letter.
I'm ashamed to say, we gave in. Here. Here are your bloody traffic lights:
Yes, this will make half our users happy. But we have paid a toll, a heavy toll indeed. We have abandoned the very principles we lived by, the values that stood at our very core. And for what? Commercialism. Shame! Shame! Shame!
Peer pressure can do this to a man.
You take care out there, kids, it's a tough world.
#2 Added user agent and language data to preferred domain check
You may come across scenarios during your day-to-day technical SEOing, where the preferred domain results do not match your expectations. 'Redirecting to m.example.com, but why?!' you may be known to cry.
Well, Sitebulb now gives you more details of the HTTP request used - the User Agent and the Accept Language - both of which may impact how the site handles the request. You can configure these by clicking change settings, which will take you to the Global Settings page, where you can adjust both of these things.
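If you want to see how these two headers ride along on a request, here is a minimal sketch using Python's standard library. The user agent string and language value are placeholders, not what Sitebulb actually sends, and nothing is fetched here: the request is only built.

```python
from urllib.request import Request

def build_request(url, user_agent, accept_language):
    """Build (but do not send) a request carrying both headers."""
    return Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Accept-Language": accept_language,
        },
    )

# Placeholder values, for illustration only.
request = build_request("https://example.com/", "ExampleCrawler/1.0", "en-GB")
```

A site that redirects by device or language may answer this request differently from the one your browser sends, which is the sort of mismatch the extra details are meant to explain.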
#3 Added Accept Language to configurable Global Settings
Following #2 above, we couldn't very well tell you the language but not allow you to change it. So now you can, from the Settings -> Crawler options. The first time you use the software, it will auto-detect your language settings and set the default language accordingly (but it won't do it again after that).
#4 On the All Hints page, changed the 'All Hint Data' export to 'Export All Data'
The old export was near useless, with about a million worksheets you needed to tab through. The new export is actually a collection of all the individual hint exports, along with all the reports for each section of the audit. Way more usable.
#5 'Crawl alternates' is no longer on by default
In the 1.6.0 update, we added the ability to crawl alternate URLs, and set this as 'on' by default. Turns out, this is pretty fucking annoying. WordPress sites, for example, will spew out oembed alternate links for every page on the site. So from one crawl to the next, you would see page totals doubling. This feature is best reserved for when you actually need it, so going forwards you'll need to turn it on in the Advanced Settings.
#1 Reduced timeouts to preferred domain checks
Some users reported an issue with the little check Sitebulb does when setting up a new Project, to determine the preferred domain, where the check would take a long time (> 30 seconds). This would happen when one (or more) of the 4 options was completely inaccessible, and was timing out. We've reduced the timeouts on this check, so these edge cases will still take a little longer than normal, but only a few seconds in total.
#2 Finally resolved occasional issues with exporting/importing
As we've been improving Sitebulb, we've been making it faster wherever we can. Turns out we made it too fast in some places, so the export/import overlay could not catch up. We added a slight delay to the export building process, which has resolved this issue.
#3 Split out Hint exports to be indexable/non-indexable
We recently added exports for every single Hint, but in the individual exports they would include both indexable and non-indexable URLs. Doh! We've split them out now to only include the 'right' stuff.
#4 Fixed domain resolution for South African TLDs
If you entered a website with a South African TLD, after the preferred domain checks, Sitebulb was suggesting you crawl https://co.za. The fuck, Sitebulb??
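We won't go into exactly what went wrong, but one common way to end up at something like co.za is to assume a domain is always its last two labels. The sketch below shows that pitfall being avoided with a tiny, hand-picked suffix set. The set and the function name are assumptions for illustration; a real implementation would lean on the full Public Suffix List.

```python
from urllib.parse import urlsplit

# Tiny illustrative sample, not a complete list of multi-part suffixes.
MULTI_PART_SUFFIXES = {"co.za", "org.za", "co.uk"}

def registrable_domain(url):
    """Return the domain a site is registered under, e.g. example.co.za."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # If the last two labels form a known suffix, keep one more label.
    if len(labels) >= 3 and ".".join(labels[-2:]) in MULTI_PART_SUFFIXES:
        return ".".join(labels[-3:])
    return ".".join(labels[-2:])
```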
#5 Fixed mis-firing of "<head> contains invalid HTML elements"
Our new ohgm-inspired Hint was accidentally firing when it found references to <p> in a script in the head. Our bad.
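To show why a script can trip up this sort of check, here is a rough sketch using Python's built-in HTML parser, which treats script contents as plain text and reports comments separately, so neither gets counted as an element. The class and function names are invented, and this is an illustration rather than how Sitebulb does it.

```python
from html.parser import HTMLParser

# Elements allowed inside <head> (metadata content in the HTML standard).
VALID_HEAD_TAGS = {
    "title", "base", "link", "meta", "style", "script", "noscript", "template",
}

class HeadChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_head = False
        self.invalid = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif self.in_head and tag not in VALID_HEAD_TAGS:
            # A real element in the head that does not belong there.
            self.invalid.append(tag)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def invalid_head_elements(html):
    """Return the tag names of invalid elements found between <head> and </head>."""
    checker = HeadChecker()
    checker.feed(html)
    checker.close()
    return checker.invalid
```

A naive search for the string '<p>' anywhere in the head would flag the script reference; walking actual elements does not.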
#6 Fixed crappy word counts
Occasionally Sitebulb's algorithm for figuring out the content area (and thus, the 'Content Words' and the 'Template Words') would get one mixed up with the other, the other one mixed up in both, and both mixed up in all. Normally such errors are due to really shit HTML, but in this case we found an example that was entirely Sitebulb's fault.
#7 Duplicate Content export button now wired up correctly
From the 'All Hints' page, the export button associated with Duplicate Content did nothing, because it was not wired up to do anything. It is now.
Released on 24th November 2017
#1 Preferred domain check
When you start a new Project, Sitebulb will go off and check that the start URL is the right URL, right after you enter it. It checks the http/https/www/non-www versions to see how each responds, and advises you which it thinks is the best option.
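The four versions being checked can be sketched like this. It only builds the candidate URLs; how each one is fetched and judged is not shown, and the function name is an assumption for illustration.

```python
from urllib.parse import urlsplit

def domain_variants(start_url):
    """Return the http/https and www/non-www variants of a start URL's host."""
    # Accept bare hostnames such as "example.com" by assuming a scheme.
    if "://" not in start_url:
        start_url = "http://" + start_url
    host = urlsplit(start_url).hostname or ""
    bare = host[4:] if host.startswith("www.") else host
    return [
        f"{scheme}://{prefix}{bare}/"
        for scheme in ("http", "https")
        for prefix in ("", "www.")
    ]
```

Whatever you type in, the same four candidates come out, which is why URL-typers get the most benefit.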
If you tend to copy/paste URLs from your browser, you may not notice a lot of difference here. But if you're a URL-typer, this could save you a ton of wasted time.
Of course, this may also reveal some issues that you need to get fixed!
#2 Crawl faster (with threads)
We've been rather reluctant to add the ability to crawl with threads, for very good reason, as this method of crawling is notoriously bad for crashing servers and pulling down websites. But it's such a regular request that we figured we'd better do something about it.
So we've added the ability to crawl with threads (up to a maximum of 25, beyond which point the tool does not really work any faster).
We've written a guide on How to Crawl Really Fast as well as its counterpoint: How to Crawl Responsibly, which emphasises all the reasons we were reluctant to do this in the first place. We'd encourage all users to read the second piece, as it should also give you a better understanding of how the tool works.
#3 Better 404 Testing
Up until now, we've had 404 tests visible on the URL Resolution section on the Audit Overview. The results they've thrown up have caused us a number of questions, so it's clear that what we had before was not clear enough. Clearly.
In order to be clearer, we have moved the 404 Tests to a tab on the Internal URLs report. Not only that, we've added a lot more tests, and more details about what we are testing and what the tests show.
This constitutes much more thorough 404 testing - checking for pages, folders, images, CSS, text files and XML. Each should respond with a 404 response.
We have *ahem* deliberately left our site misconfigured, for the purpose of this demonstration. The sacrifices I make for you guys eh?
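For a feel of what a test like this involves, the sketch below builds one URL per resource type that should not exist on the site, using a random name. The suffixes and names are assumptions for illustration; each URL would then be requested, and anything other than a 404 response is worth a look.

```python
import uuid
from urllib.parse import urljoin

# Hypothetical suffix for each resource type mentioned above.
PROBE_SUFFIXES = {
    "page": "",
    "folder": "/",
    "image": ".png",
    "css": ".css",
    "text": ".txt",
    "xml": ".xml",
}

def build_404_probes(site_root):
    """Return {resource type: URL that should not exist on the site}."""
    token = uuid.uuid4().hex  # random name, very unlikely to be a real path
    return {
        kind: urljoin(site_root, f"/{token}{suffix}")
        for kind, suffix in PROBE_SUFFIXES.items()
    }
```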
#4 More crawl control: crawl canonical, pagination and alternate URLs
Some users have asked for more control of what Sitebulb will crawl, specifically relating to canonicals and pagination links, so we've added some new crawl controls in Advanced Settings (under the 'Robots' tab).
By default, Sitebulb WILL schedule and crawl any canonical URLs, alternate URLs or pagination URLs that it finds - either in the <head> or in HTTP headers. In order to STOP Sitebulb crawling any of these URLs, you'll need to tick the appropriate box.
Note that if a URL you wish to stop Sitebulb crawling is also linked to via anchors, it will still get scheduled and crawled. You'd need to use 'Excluded URLs' in Advanced Settings.
#5 Pre-audit notifies you about site features
#6 Loads of new Hints
Gareth's favourite saying is 'the internet is broken.' While most blokes while away their weekends watching sport or drinking beer, Gareth prefers to sit in a darkened room, trawling the internet to find new examples of shitty web pages. He's been known to exclaim, without warning, 'JUST LOOK AT THE SIZE OF THAT HEAD.'
New Hints as follows:
- HTML is missing or empty - literally, pages with absolutely nothing on (consider exhibit 1 and exhibit 2)
- Has link with a URL referencing a local or UNC file path - people do some stupid shit when putting together web pages, like linking to files that no one in the world can actually see.
- Has link with a URL referencing LocalHost or 127.0.0.1 - say it with me, 'the internet is broken.'
- Has a link with whitespace in href attribute - like, if someone accidentally put a space at the end of the URL, à la href="https://example.com/page/%20". How embarrassing.
- Next/Prev Paginated URL is canonicalized to different URL - I mean, what are you even playing at with this shit?! If you canonicalize a paginated page, Google is not going to crawl the rest of the paginated series. Dumbass.
- Noindex found on rel Next/Prev Paginated URL - Oh. This one is not that bad actually. But nice to know, I guess.
- Internal/Resource URL is part of a chained redirect loop - This is more like it. Redirect chains that go round in a big loop, like 1 -> 2 -> 3 -> 4 -> 1. Internet = broke.
I'm not finished yet, we have three more. These were all inspired by serial-internet-breaker @ohgm in his latest escapade 'Breaking the Head (Quietly)'. Stop reading my drivel, go and read his instead. Then come back and appreciate these 3 new Hints:
- <head> contains a <noscript> tag
- <head> contains a <noscript> tag, which includes an image
- <head> contains invalid HTML elements
We'll be getting some Sitebulb t-shirts printed with #internetisbroken if enough of you start tweeting out the hashtag. I'll be the judge of when enough is enough, thank you very much.
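Of the Hints above, the chained redirect loop is the easiest to picture in code. A minimal sketch, assuming the redirects have already been collected into a simple source-to-target mapping (the function name and data shape are invented for illustration):

```python
def find_redirect_loop(redirects, start):
    """Follow a {source: target} redirect map from start.

    Returns the loop as a list of URLs (first URL repeated at the end),
    or None if the chain ends without looping.
    """
    path = []
    seen = {}
    url = start
    while url in redirects:
        if url in seen:
            # We are back at a URL already visited: slice out the loop.
            return path[seen[url]:] + [url]
        seen[url] = len(path)
        path.append(url)
        url = redirects[url]
    return None
```

Feed it the 1 -> 2 -> 3 -> 4 -> 1 example and it hands the whole loop back, whereas a chain that eventually lands on a normal page returns nothing.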
#7 Search for a Hint
You can view all Hint data via the All Hints screen (top right nav) or via the Hints tab on the Audit Overview. Both of these now have a 'search for a Hint' box. Note that this only searches triggered Hints, so if you search for a Hint and it doesn't come up, then that means it was a green tick pass.
#8 Autoscroll to previous view scroll location
You're using Sitebulb in full blown investigation mode, you're out to crack some heads together today. You're checking all the reports, inspecting each and every graph, looking for something, something. You know it's there somewhere. A string for you to pull on. Then all of a sudden, you feel your spidey sense tingle. A graph. A pattern. That might be it. It might just be. You have to know more. So you click, and you're there, IN the data. You're Neo now, and you can see everything for what it truly is. You can feel it, you're right on top of it. You pause, not wanting to get ahead of yourself, trying to slow the heart beating out of your chest. You want to double check the graph, make sure you've got it right. You hit 'back' and WAIT. WHAT?! WHICH GRAPH WAS I LOOKING AT AGAIN?? NOOOOOOOOOOOOOOOOOOOOOO!
No one can be told what the Autoscroll to previous view scroll location is. You have to see it for yourself:
#9 Select Google Data for up to 90 days
Thus far, Sitebulb has offered a paltry 30 days' worth of data to check. With this 3X update, you can now select up to 90 (ninety) days' worth.
#10 Added Google Analytics Page Timings Data
Sitebulb has a Google Analytics integration. It also does Site Speed testing. We figured, why not mash these things together and pull out the GA Page Timings data?
This is what you'll get in the Site Speed report (on a separate tab):
You'll get this by ticking 'Site Speed' and selecting a Google Analytics profile, when setting up the Project.
We plan to iterate on this feature to make it more useful, so please hit us up with any ideas you have about it.
#11 Added Happiness feedback to toolbar
Here in the world of desktop SEO software, we live in a vacuum for most of our lives. And no, I don't mean we live in a fucking vacuum cleaner, I mean that we don't really get to talk to our customers very often.
So we are always looking for new ways to elicit feedback. Our latest idea is a happiness button in the toolbar:
#12 Low disk space warning
Not everyone is aware that Sitebulb does not hold data in RAM (like most desktop crawlers) but writes to disk instead. This means that if you're crawling a big site and you don't have much disk space, bad things are going to happen.
To mitigate this risk, Sitebulb will now warn you if you've got less than 5 GB space remaining - which is where you might want to start thinking about it.
If you're the type who likes to live life on the edge, you can simply dismiss this message with the handy 'X'.
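A check along these lines is only a few lines of Python. This is a sketch of the idea rather than Sitebulb's own code: the function name is invented, and the 5 GB from above is treated as 5 GiB for simplicity.

```python
import shutil

# 5 GB threshold from the notes, treated here as 5 GiB.
LOW_SPACE_BYTES = 5 * 1024**3

def is_low_on_disk(path=".", threshold=LOW_SPACE_BYTES):
    """Return True if the drive holding path has less free space than threshold."""
    return shutil.disk_usage(path).free < threshold
```

Point it at the directory where audit data is written, since that is the drive that fills up.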
#13 New column added: "No. Outgoing Navigation Links"
This is part of a wider plan, to offer more visibility to internal linking. It starts with this acorn, splitting out navigation links on a page. You'll find it as a new column in URL Lists, so feel free to have a play with it and tell us what you think.
Otherwise... watch this space.
#1 Fixed Bosnian language code on hreflang check
In the International report, the language code 'bs' was being incorrectly labelled as invalid. Thanks to the helpful user who pointed out that Sitebulb was talking complete BS! (sorry)
#2 Corrected colour coding on compare audits
Our vaunted compare audits feature includes helpful colour coding, so you can quickly see what has improved or got worse (using the universal colours for good and bad: green and red). Previously it was doing this in a 'dumb' way, where any increase was green. Now, using a ground breaking combination of AI and machine learning, it correctly figures out things like 'less 404 errors is actually a good thing.' Ground = broken.
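Joking aside, the fix boils down to knowing, for each metric, which direction counts as good. A minimal sketch with invented metric names (no AI was harmed in the making of this example):

```python
# Hypothetical metric names; True means a higher number is the better outcome.
HIGHER_IS_BETTER = {
    "indexable_urls": True,
    "404_errors": False,
    "broken_links": False,
}

def change_colour(metric, previous, current):
    """Return 'green', 'red' or 'neutral' for a change between two audits."""
    if current == previous:
        return "neutral"
    increased = current > previous
    # Green only when the direction of change matches the 'good' direction.
    return "green" if increased == HIGHER_IS_BETTER[metric] else "red"
```

The old 'dumb' behaviour is what you get if every metric is marked as higher-is-better.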
#3 Fixed the All Hints page showing SEO and HTML Hints grouped together
The Sitebulb community has been up in arms about this one! Hundreds of you pointed out that the On Page section is split into two Hint groups: SEO and HTML, and combining them as one on the All Hints page was an abomination. Put your pitchforks back in the pitchfork cupboard, people, for they are now separate.
#4 Fixed Hint for 'Base URL Malformed'
Sitebulb was incorrectly claiming that <base href="/"/> was an illegitimate base URL.
#5 Changed sheet name for Hint: Has only one followed internal linking URL
On the export for this Hint, the sheet name used to be called 'Has 1 incoming link'. This has been changed to 'Pages with only 1 linking URL', which is more accurate, and because Sitebulb users are magnificently anal and we love them for it.
#6 Syntax support in exclusion list
One of our users noticed that $ symbols were not being correctly recognised when used in the 'Exclude URLs' setting, so URLs were not being excluded as they should. This has been resolved, so feel free to throw your $$$ around like a mother fucking gangster bro.
#7 Fixed the back button on Audits
If you followed this path: Dashboard -> Project -> Audit, and then hit the Back button, you'd be returned to the Dashboard, rather than the Project.
#8 Fixed: Multiple GA codes Hint
This Hint was occasionally firing false positives, for instance if a GA code was referenced in a script. We've now fixed it so that it only reports if 2 or more different GA codes are found.
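The 'different codes' part is the bit that matters, and it can be sketched by collecting codes into a set before counting. The pattern below assumes classic UA-style property IDs and is illustrative only; it is not the exact rule Sitebulb applies.

```python
import re

# Classic Universal Analytics property IDs look like UA-12345678-1.
GA_ID_PATTERN = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")

def has_multiple_ga_codes(html):
    """Return True only if 2 or more different GA codes appear in the HTML."""
    return len(set(GA_ID_PATTERN.findall(html))) >= 2
```

The same code mentioned twice (say, once in the snippet and once in another script) collapses to a single entry, so it no longer counts as 'multiple'.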
#9 Fixed Hint: URL receives follow and nofollow links
In some cases, this was not correctly reporting the nofollow links, making the Hint pretty useless.
#10 Fixed: Cache headers reporting invalid
Cache headers were not being reported correctly, which was incorrectly firing this Site Speed Hint: Set long expires cache headers.
Released on 3rd November 2017 (hotfix version)
#1 Fixed export on compare audits
Per update 1.5.0, we added a ton of new exports, changing the way exports are built in the process. This managed to break the export associated with 'Compare Audits', which was pointed out to us 13 seconds after we launched 1.5.1 (thus inspiring our new hotfix release).
#2 URL resolution checks (404, Non-WWW, etc) on some sites were being rejected by the server
You probably won't notice much difference here, but the little URL Resolution checks on the Audit Overview were failing on some sites and showing inaccurate results. Rest assured that everything is ok now.
Released on 2nd November 2017.
#1 Historical trend data via sparklines
How awesome is this?
Carry out more than one audit within a project, and you'll be presented with little sparklines everywhere which show changes over time. They appear on all the main 'Insight' numbers as below, and alongside all the Hints.
Up and to the right people!
3 important things to note:
- This will work for new AND existing audits, so you can check it out right now if you have a Project with a few audits on.
- Even if you delete your old audits, this history will persist so the sparklines will still show your trend data.
- If you're just dialing in your technical SEO, your sparklines will remain flat forever, making you look like the lazy, no good, piece of shit that you are.
#2 New report: Duplicate Content
Duplicate Content is one of those things you have to check on pretty much every single audit, and when it gets out of hand can cause a whole bunch of problems. Previously, we had a tiny little section for this, squashed into the On Page report. We've gone the other way with it now, and it's got a full blown section all for itself.
This is the type of thing you can enjoy seeing in the duplicate content section (plus a whole load more):
A word of warning, however. Despite all the pretty graphs, you can't click through and view duplicate content data in URL Lists. You will need to grab the export file instead (the green button at the top), which will give you everything you need to fix your dup content problems.
This is because URL Lists are entirely inadequate for communicating duplicate content data. It's not intuitive why not, so I'll explain a little further. URL Lists are built to display 1-to-1 data. There is 1 URL per row, and all the data on that row relates only to the unique URL in question.
However, if we take duplicate Page Titles as an example, in that instance we need to say 'here is a URL with the page title of '10 Cat Pictures that are so cute you will just cry and then you won't believe what happened next' (for example), but here are also 73 other URLs with the exact same title.' So it is more like a 1-to-many relationship that we need to communicate, where the 73 are somehow grouped and associated with the 'original URL.'
Anyway, we are working on a better way of displaying this, so for now just use the export instead.
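To make the 1-to-many point concrete, here is a small sketch that groups URLs by page title. The function name and input shape are assumptions for illustration, and the layout of the real export may differ.

```python
from collections import defaultdict

def group_duplicate_titles(pages):
    """Group URLs that share a page title.

    pages: iterable of (url, title) pairs.
    Returns {title: [urls]} for titles used by more than one URL.
    """
    by_title = defaultdict(list)
    for url, title in pages:
        by_title[title.strip()].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```

Each key holds a whole list of URLs, which is exactly the shape a one-URL-per-row table struggles to show.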
#3 New charts added to the On Page report
To fill the void left by duplicate content, we've added some new charts to the On Page report, highlighting critical on-page SEO elements.
There are 6 new pie charts in total, displaying data about titles, meta descriptions and H1s - to do with both length and presence thereof - which should hopefully make it easier to pick out optimization opportunities.
Additionally, from the global settings area, users can define the values for 'too short' and 'too long', so you can be captain of your ship, vis-à-vis the length of on-page SEO elements.
#4 Sitebulb users are no longer forced to download the latest version!
Jonny Rockstar, a self-proclaimed SEO guru, takes a call from a client in his swanky attic office:
Jonny: You've reached Rockstar SEO, I'm Jonny Rockstar, how can I help you today?
Client: Hi, I was wondering if you could help me with my SEO?
Jonny: Today's your lucky day my friend, you've reached the right place! I am the world's top expert on SEO don't you know?
Client: Oh wow, you sound amazing. I wish I was you. So you can help me?
Jonny: I can do anything. What's your website, I'll take a quick look now for you.
Client: Great! My site is secondhandsocks.com
Jonny: Sure, I'll just fire up my...Oh.
Client: What's wrong?
Jonny: FOR FUCK SAKE. I NEED TO UPDATE AGAIN?! I JUST DID IT LAST WEEK. YOU STUPID PIECE OF SHIT.
Client: (hangs up)
Sound familiar? Well, from this version onwards*, you won't be forced** to update if you don't want to***.
* The change only exists from 1.5.0, so you kinda will be forced to update this one. And by kinda I mean completely.
** We will however notify you of new versions in future, and strongly suggest you install them.
*** Also if we release a critical update, we actually will force you to update it. Because critical.
#5 Export internal links directly from URL Lists
We're going to town with exports in this update. This one is a cheeky little update to URL Lists, that allows you to instantly export incoming internal linking URLs (to the URL in your list).
Of course you could just click the blue 'View URLs' link and export from there, but this saves you 1 click. Think of the dozens of seconds this will save you across a whole year! This thing is like a freaking time machine!
#6 Added export button to individual Hints
For even more exporting joy, you can now export individual Hint data directly from URL Lists.
...and on the Hint page itself (#7)
The important thing to note about both these Hint exports is that this is not always equivalent to exporting the rows from the URL List. This is because not all Hints are created equal.
Some Hints need special treatment (a bit like duplicate content, see above), so those Hints have customised exports. Some examples: broken links, images without alt text, and redirects. These are more cases where there is a many-to-1 relationship, so the exports are built to handle this.
#8 New copy experiment when building PDF reports
PDF reports take about 30 seconds to build, and we used to have a message up there encouraging feedback while you wait. We've just changed it for something a little more fun. Let us know what you think!
#9 Scroll to adjusted column
This is one of those tiny UX changes that new users will not notice at all, but Sitebulb veterans will love (Aside: can you be a veteran for a product that's less than 6 weeks old?).
Anyway, whenever you adjust a column in URL Lists (i.e. sort or filter), Sitebulb will now scroll you along back to that column. Previously it would dump you back on the first column, which was très annoying.
I'll demonstrate through the medium of gif:
#10 Free users can now download exports
We have a free plan, for crawling sites up to 500 URLs. This is perfect for users with small sites, or for people who hate paying for things. We'd left a restriction on there meaning free users could not process any exports. We decided this was a bit tight, so we've now enabled exporting for all free users.
Who's tight now eh?
#1 Fixed pause and resume!
Sorry! I know, I know. Pause and resume was broken, so if you paused an audit, you couldn't resume :(
We feel partially responsible for this as well.
Maybe in time we can forgive each other...?
#2 AMP Hints had a (small) makeover
A number of people pointed out that these two AMP Hints were bogus:
- AMP URL is not indexable
- AMP URL is not in a sitemap
#1 was happening because of the way general URLs are classified as indexable or not indexable, which is of course impacted by canonicals. Since AMP URLs are MEANT to have canonicals, this Hint did not make a lot of sense.
#2 was only advisory anyway, but we removed it because I got fed up of people pointing out to me that 'John Mueller said you don't need AMP URLs in sitemaps therefore you're wrong.' I don't need the hassle in my life.
We also removed AMP URLs from appearing in a lot of the other reports, as they really need to be treated in their own AMP-esque context.
#3 HTTP headers now correctly parsing canonical link
We came across a site that was setting an image source link element AND a canonical link element in the HTTP headers. Sitebulb was not correctly identifying the canonical, happily firing off warnings left, right and centre.
Magnificent as this Hint is, Sitebulb was actually wrong to fire it in this instance, as the canonical setup used was perfectly valid.
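For reference, a single Link header can carry several links separated by commas, each with its own rel value. Here is a simplified sketch of picking out the canonical one. It is illustrative only: it does not cope with commas inside quoted parameter values, and the function name is invented.

```python
import re

# One link: <url> followed by zero or more ;param pairs.
LINK_PART = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')
REL_PARAM = re.compile(r'\brel\s*=\s*"?([^";]+)"?', re.I)

def canonical_from_link_header(value):
    """Return the URL whose rel includes 'canonical', or None if there isn't one."""
    for url, params in LINK_PART.findall(value):
        rel = REL_PARAM.search(params)
        # rel may hold several space-separated relation types.
        if rel and "canonical" in rel.group(1).lower().split():
            return url
    return None

# Example header value, made up for illustration.
header = '<https://example.com/img.png>; rel="image_src", <https://example.com/page>; rel="canonical"'
```

Reading only the first link in the header, or treating the whole value as one link, is the sort of shortcut that misses a canonical sitting in second place.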
#4 You can once again click on nodes in Crawl Maps
I have a bone to pick with you. All of you. No one bloody told me that the 'click node to view URL' feature in Crawl Maps had stopped working.
Or is this your collective way of telling me you didn't know it could do it??
It's awesome, let me try and sell it to you... -> Click on any node and BOOM straight to the URL details!
How did I do?
#5 Special characters removed from all export sheetnames
We should have seen this one coming. Mainly because we'd already spotted it once before relating to slashes (/s). Some Hint exports were failing, because we'd left some special characters in the sheetnames. F$&k my life.
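Assuming the exports are Excel workbooks, the usual rules are that sheet names cannot contain \ / ? * [ ] or : and cannot be longer than 31 characters. A cleanup step might look like the sketch below. The function name and the fallback name are assumptions for illustration.

```python
import re

# Characters Excel rejects in worksheet names.
INVALID_SHEET_CHARS = re.compile(r"[\\/?*\[\]:]")

def safe_sheet_name(name, max_length=31):
    """Swap invalid characters for spaces, tidy whitespace and trim to length."""
    cleaned = INVALID_SHEET_CHARS.sub(" ", name)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Fall back to a generic name if nothing usable is left.
    return cleaned[:max_length].strip() or "Sheet"
```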
#6 Maximize no longer cuts off scrollbars
Some people are overly fixated on size. Almost like they're covering up for something...
Anyway, some users like to maximize Sitebulb so it takes up the whole screen. Which is FINE. Unfortunately Sitebulb was penalising these size-obsessed freaks by cutting off the scrollbars on the URL Details page, making it look like you couldn't scroll down.
#7 Fixed issue where some exports were getting stuck
In the last release, we optimized the exporting process to make it a lot faster. Turns out we made it too fast, and the UI was actually struggling to keep up with it. This will have resulted in some exports getting stuck (either at 2% or 95%, for some reason) on some audits, some of the time. If I had to put a number on it, I'd say it affected precisely 4.1279% of all exports carried out.
#8 Typo fixed in BOM Hint
A couple of days ago I shared something cool with the world - the BOM hint that Sitebulb triggers (hat tip to Glenn Gabe for his work on this). Whilst most were suitably awestruck, one individual took it upon himself to publicly shame us for our child-like spelling mistake.
What gives? Nerd.
10 new features and 8 fixes in one update?! A big one indeed (that's what she said).
Released on 13th October 2017.
- We've made crawling and processing data more efficient, so Sitebulb will now crawl faster, and on most websites, is better able to approach the max URLs/second limit you set. We've also changed the way crawl speed is reported, switching to a cumulative average, and added a new metric 'Current TTFB.' If you see Current TTFB increase, you will see a decrease in crawl speed, for the two are as entwined as a dragon queen and a Northern bastard.
- Since speed is on the agenda, we've also improved the way the user interface queries data, making it imperceptibly faster. You're welcome.
- It appears that no one is really using the report exports we spent a million hours trying to get right (thanks, guys). So we have tried to make them more visible by building a 'Bulk Exports' page, which explains each one and allows you to download them all (including a 'download all' button as well).
- Since we needed to add another button, we also figured you'd appreciate a massive UI change. If you look to the top right, you'll now find Filtered URL Lists, Bulk Exports, Printable Reports, Crawl Maps and All Hints.
- Advanced settings now pops up underneath the main settings when you click the 'Advanced Settings' button, instead of shifting you to a new screen, which some users were finding unclear (and we're all about clarity here at Sitebulb).
- Occasionally, if a server hates you crawling it, it will return a 429 HTTP status (too many requests). This is the server's way of saying 'would you kindly fuck off now?' Sitebulb will now take this advice, and stop trying to crawl the site.
- Sitebulb was claiming the hreflang nn-no was invalid, when it is in fact perfectly valid (nn is Norwegian Nynorsk, as you were no doubt already aware). Tilgje meg.
- The Indexability export was not pulling through the Indexability Hint data, which it should have been. This is because it was still looking for the 'indexation' data, which was renamed recently to avoid Barry Adams losing his shit every time the word was uttered.
- Sitebulb was forgetting crawl and analysis settings when re-auditing, which was super annoying. Like when you go upstairs to get your phone charger and get distracted by a fly relentlessly smashing itself against a closed window and in a blind rage you chase it across the house with the sport section of Saturday's Guardian, swinging aimlessly (and frankly, embarrassing yourself) and you have no idea what you were writing about when you first started this sentence.
- Tidied up a few of the exports, which had got a bit unruly.
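The 429 behaviour mentioned above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Sitebulb's crawler: the crawl function and the fake_responses lookup are hypothetical, and the fetch callable stands in for a real HTTP client so the example runs offline.

```python
def crawl(urls, fetch):
    """Crawl URLs in order, stopping as soon as the server answers 429.

    `fetch` is any callable that takes a URL and returns an HTTP status code.
    """
    crawled = []
    for url in urls:
        status = fetch(url)
        if status == 429:  # Too Many Requests: take the hint and stop
            return crawled, f"stopped: 429 received for {url}"
        crawled.append((url, status))
    return crawled, "finished"

# Stand-in for a real HTTP client (hypothetical paths and statuses).
fake_responses = {"/a": 200, "/b": 429, "/c": 200}
result, reason = crawl(["/a", "/b", "/c"], fake_responses.get)
print(result, reason)
```

Note that "/c" is never requested: once the 429 arrives, the loop gives up on the site rather than retrying.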
Released on 3rd October 2017.
- New feature alert! Printable PDF Exports are now in town. You asked (again and again and again), so we delivered. Since this is a real big boy update, it deserves more than a snarky comment from me, so it's got its own fully fledged user guide. Because my little comment isn't good enough.
- I mean, let's be frank, the user guide literally says 'click this button and save it', you can't work that out on your own?
- I Am Jack's Inflamed Sense of Rejection.
- Mac users! You can now hide the app with command+H, and quit the app with command+Q. Now, please, enough with the death threats ok?
- Regex enthusiasts, we've got something for you too! You can now filter columns using regex. It's a bit slower than the normal filtering method, but at least it's there now, ok?
- Adjusted the Dashboard to allow you to switch between Projects/Imported Audits/Paused Audits/Queued Audits/Interrupted Audits, which makes it a lot clearer which state each audit is in currently.
- Added a link to our new Feature Requests board - in the left navigation on the dashboard.
- Adjusted the robots.txt checking, so that it now treats this sort of rule: Disallow */example-path/ in the same manner as Google. Note that this is not listed as a fix, because Sitebulb has been doing it correctly this whole time. Robots commands should always start with a / but it seems Google knows what you are going for so just lets it fly. Sitebulb will do the same, but only because it wants to. A dragon is not a slave.
- Fixed an annoying issue that has been lingering around for a while, but we couldn't find out what was causing it. From the Audit Progress screen, if you viewed Realtime URLs and then clicked the back button, Sitebulb would start another audit. Yeah, it's not really meant to do that.
- This one stumped a few of you as well (sorry). The hint HTTPS URLs links to HTTP was wired up totally wrong. Y'all were emailing me asking what you'd done wrong, and it was not you, but us. Stop blaming yourself. It isn't your fault!
- URL Details were not showing nofollowed incoming links. This is fixed now. Page Rank Sculptors, breathe easy.
- Adjusted the duplicate content checking algorithm to stop it flagging some false positives. It was claiming some URLs were duplicates when they had completely different content on them. Hardly Panda arousing.
- Improved checking for empty or missing H1s, as it was missing some in certain circumstances.
- Fixed a typo on the setting page, we'd gone all Slumdog 'Who Wants to Be a Millonnaire.'
- Over the last week we've had a few instances of database corruption, which is a massive ballache. We've not been able to recreate this issue, but we've made a couple of critical updates to the way we write and read data. Fingers crossed that's it sorted. Symptoms of this are: Sitebulb doesn't fucking work at all. Please let us know if this happens to you.
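To make the robots.txt point above a bit more concrete, here is a small Python sketch of wildcard matching for Disallow paths, using the * (any characters) and $ (end of URL) conventions that Google documents. It is an illustration only: the function name is made up and this is not Sitebulb's implementation. A rule like */example-path/ starts with a wildcard rather than a slash, so it matches that folder at any depth.

```python
import re

def robots_rule_matches(rule_path, url_path):
    """Check a robots.txt path pattern against a URL path.

    Supports the `*` wildcard and the `$` end-of-URL anchor.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape everything literally, then let each `*` match any characters.
    pattern = ".*".join(re.escape(part) for part in rule_path.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors at the start of the path: rules are prefix matches.
    return re.match(pattern, url_path) is not None

print(robots_rule_matches("*/example-path/", "/shop/example-path/item"))
print(robots_rule_matches("*/example-path/", "/other/"))
```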
Released on 27th September 2017.
- We've added a new all singing all dancing Hints section to the Overview so you can see exactly how many Hints were triggered across all the reports (like this). Mind = blown.
- Also added a little flag to show how many Hints were triggered on each report (like this) so you know when you done fucked up. If you happen to actually be good at your job, you'll earn Sitebulb's respect (like this). By the way did you see whose site that was? Not so cocky now are you?
- We have adjusted the Hint 'Contains links with no anchor text', so that Sitebulb now only looks at internal links, where previously it was looking at any old link. This was adjusted because all Sitebulb users that we surveyed said they give zero fucks about optimising anchor text on external links.
- The Link Equity Score has been changed from having 4 decimal places to 2 decimal places, because brevity > precision 99.7645% of the time.
- When you pause an Audit and go to 'Update Settings', we've added a cancel button down at the bottom, because one user reported being frightened by the Save button (note: those may not have been his exact words).
- When you export any of the charts as images, they will now contain the name of the chart in the filename, rather than the accurate-though-pretty-fucking-useless alternative: 'chart.png.'
- Fixed a crazy error which happened if you enabled Site Speed and looked at the export, where there would be tons of additional 'B Score Worksheets'. I'll be honest, it wasn't really meant to do that, so now it doesn't.
- In a CRO experiment, we tried to encourage users to sign up for multiple subscriptions by opening 31 browser tabs when you clicked 'Update to Pro' (when on a Sitebulb Trial). Astonishingly, the experiment failed.
- On some URL lists, if you added/removed columns, Sitebulb would totally ignore you and just display whatever it wanted to show, like a petulant teenager. We told Sitebulb to stop being an obnoxious little shit, and this behaviour has miraculously improved (for now...)
- The protocol column was displaying everything as http, even when it was actually https. This is because Gareth, in his infinite wisdom, had trimmed the database column down to 4 characters, because he's stuck in 2007 and literally 'forgot' about https. In other news, 2007 was TEN YEARS AGO. We know it's shocking but don't worry, we're here for you.
- Fixed a small typo that one of our users noticed - in the settings tooltip for XML Sitemaps - it said 'and' instead of 'any'. I was tempted to change it to 'anal.'
- In the 'mixed content' Hint, we'd built the export to include a slash (/) in the filename. Anyone who's ever used Windows ever knows that WINDOWS HATES SLASHES. We should have known better.
- Some servers were throwing a hissy fit when reading our Accept request headers and responding with an Internal Server Error 500 (which meant you couldn't even do the pre-audit, never mind crawl the site). 99.9999% of you would never have noticed, but we've fixed it for the 0.0001%, because precision is very important at all times.
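On the slash-in-the-filename business above: Windows refuses the characters < > : " / \ | ? * in file names, so an export name built from a Hint title needs scrubbing first. A minimal Python sketch, assuming a hypothetical export_filename helper and a made-up Hint title (this is not how Sitebulb names its files):

```python
import re

# Characters Windows does not allow in file names.
WINDOWS_FORBIDDEN = re.compile(r'[<>:"/\\|?*]')

def export_filename(hint_name, extension="csv"):
    """Build an export file name that Windows will accept."""
    # Swap forbidden characters for hyphens; Windows also dislikes
    # names ending in a space or a dot, so trim those.
    stem = WINDOWS_FORBIDDEN.sub("-", hint_name).strip(" .")
    return f"{stem or 'export'}.{extension}"

print(export_filename("Mixed content (HTTP/HTTPS)"))
```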
Released on 18th September 2017.
- Added some more data to the hover state on Crawl Maps. Now, when you hover over a node, you'll also see the page title, 'First found on URL' (i.e. the parent URL) and the Link Equity Score. Because more is always better.
- Changed the name of the 'Indexation' report to 'Indexability' instead. Internally referred to as 'the Barry Adams update.'
- Added the option to specify a custom user agent. You can do this either from the Advanced Settings for a single Audit (comme ci) or from the global settings for every Audit (comme ça). Et voilà!
- When you queue Audits, you'll now see a red icon alongside the 'Queued Audits' left navigation button, which shows the number of queued Audits you have. So it will show 1 for when you've got one queued Audit, and then 2 when you've got two. Basically it will increase by one every time you queue another Audit. You see? One more every time. It's just math(s).
- When you terminate an audit, the old warning message we had was 'Terminate the audit and delete data collected'. Some people thought we were bullshitting about the 'delete data collected' bit and called our bluff, only to be bitterly disappointed when their data wasn't there. Since 'I told you so' does not make for a favourable customer experience, we've re-jigged the warning message to make it clearer what will happen. We included some words in bold and everything.
- After our previous update a number of beta users were unceremoniously dumped onto our 'free' tier. They should have ended up on a 'Free Trial', which is a different thing entirely, and is pretty much indistinguishable from beta. Remember that 'soft launch' we were going on about? Yeah, this is why.
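If you are wondering what a custom user agent amounts to under the hood, it is just a different User-Agent request header. The Python sketch below shows the idea with the standard library; the build_request helper, the URL and the 'MyAuditBot/1.0' string are all assumptions for illustration and have nothing to do with Sitebulb's internals.

```python
import urllib.request

def build_request(url, user_agent):
    """Create a request that announces itself with a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

req = build_request("https://example.com/", "MyAuditBot/1.0")
# urllib stores header names capitalised as "User-agent".
print(req.get_header("User-agent"))
```

No request is actually sent here; the sketch only builds the request object so you can see the header that would go out.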
Released on 14th September 2017.
- Period fans, rejoice! (no, not that kind of period). I mean these little badgers ->. We call them 'full stops' in good old Blighty. Anyway, as per beta feedback, you can now have a period in your Project name, should such urgent need arise. Read that last sentence as you see fit.
- Several months ago Gareth and I had a very lengthy 'meeting' about whether we should use the magnificent 'log in/out' or the banal 'sign in/out'. The support for both options was fierce, and the argument raged on into the wee hours. Bloodied and limping, Gareth finally announced 'sign in' as the victor, before collapsing in a heap. Despite losing an arm in the battle, my belief in 'log in' was unwavering, and I managed to sneak her onto the 'Account' screen in the tool. It would be an understatement to say I suffered Gareth's wrath when he recently realised. I remain, persisting in a swimming pool of catharsis, overcome with grief for my fallen friend, as 'log in' is no more :(
- Added a 'Support' button to the top right of the user interface, so you can always easily get in touch with support.
- Curiously, Sitebulb had stopped asking for authentication details when trying to audit a site which uses HTTP authentication. The sharp-witted among you will realise this means it couldn't actually crawl these sites, which is a bit of a problem for a crawler product like Sitebulb. It will once again ask you for authentication details (which are then saved against future re-audits, and available via Advanced Settings - FYI).
- In very specific conditions, the HTML parser was breaking, and it isn't anymore. That's all we know.
- When you Pause an Audit and go to 'Update Settings', we had a devious Back button on that screen. This duplicitous fellow would sometimes take you back to the Audit, and sometimes back to the Dashboard. The wily trickster was proving difficult to control, so we banished him completely (don't worry, you can still return to the Audit by pressing 'Save Settings' at the bottom of the screen).