January bandwidth warning -> cause of usage found??

This forum provides an FAQ for users as well as allowing you to ask questions about MundayWeb's very popular BOINC stats services.
Halifax--lad
Established Member
Posts: 228
Joined: Sat Sep 10, 2005 11:27 pm

Post by Halifax--lad » Thu Feb 02, 2006 8:20 am

Most RSS feeds I have all have clickable links within them. It makes it much easier as I don't have to visit the site to open any links.

Neil
Site Admin
Posts: 1349
Joined: Mon Apr 18, 2005 8:35 pm
Location: UK

Post by Neil » Thu Feb 02, 2006 8:49 pm

Lee Carre wrote:my initial reaction is to ask why you'd want this? obviously it'd be nice, but a feed is meant to be read by a reader (hence serving it with a suggested MIME type of "application/xml" (well actually "application/rss+xml") meaning that it needs to be processed by an application before being presented to the user, it's not just plain text otherwise it would be "text/xml")

the news section on your home page is for regular browser use, the feed is just another means to get that same news, it's not meant to replace it
the raw data in a feed isn't meant to be seen by users, it's for feed readers to parse, then display to users
a web browser is needed to render (X)HTML documents, and a feed reader is needed to render RSS/Atom feeds, the two are separate things, a feed isn't meant to be a web page

it basically comes down to a choice of which format a user prefers
however, i'll look into this
I was just wondering if it was possible out of curiosity. I have done some hunting around and have found that the BBC, for example, use XSL to make their RSS feeds displayable in a web browser.

However, I'll be sticking with CSS!

Neil.

Lee Carre
Member
Posts: 17
Joined: Tue Jan 31, 2006 1:34 am
Location: Jersey, Channel Islands, Great Britain (UK)

Post by Lee Carre » Sat Feb 04, 2006 2:42 am

Halifax--lad wrote:Mos(t) RSS fe(e)ds I have all have clicable links within them it makes it much easier as I dont have to visit the site to open any links
but i'm guessing that's in your rss reader, which is what neil's done (which is good)
as far as i can tell, he wants to be able to view the feed in a browser (without being parsed by a feed/rss parser) and have it behave like a web page
Last edited by Lee Carre on Wed Mar 15, 2006 3:53 am, edited 3 times in total.
Want to search the BOINC Wiki, BOINCstats, or various BOINC forums from within firefox? Try the BOINC related Firefox Search Plugins

Lee Carre
Member
Posts: 17
Joined: Tue Jan 31, 2006 1:34 am
Location: Jersey, Channel Islands, Great Britain (UK)

Post by Lee Carre » Sat Feb 04, 2006 2:46 am

Neil wrote:I was just wondering if it was possible out of curiosity. I have done some hunting around and have found that the BBC, for example, use XSL to make their RSS feeds displayable in a web browser.

However, I'll be sticking with CSS!
well, at present, i have to be honest and say i don't know. if i had to guess, i'd say it's doubtful, but it might be possible, i'm still looking into it

CSS is fine for feeds, it's only to make them pretty if someone stumbles across them using a browser (if they don't know what rss is for). move to XSL if you want to, but there's nothing wrong with CSS, more browsers support CSS anyway, so it's the safer option, especially considering most people still use IE as their main browser (:()

on another note, how's the caching side of things coming along?
Last edited by Lee Carre on Wed Mar 15, 2006 3:54 am, edited 1 time in total.

Neil
Site Admin
Posts: 1349
Joined: Mon Apr 18, 2005 8:35 pm
Location: UK

Post by Neil » Sat Feb 04, 2006 12:57 pm

Lee Carre wrote:on another note, how's the caching side of things coming along?
Good so far... checked the stats yesterday and the GoogleBot has only consumed a few MB so far. For the first time ever, it's used less bandwidth than the MSNBot!

Neil.

Lee Carre
Member
Posts: 17
Joined: Tue Jan 31, 2006 1:34 am
Location: Jersey, Channel Islands, Great Britain (UK)

Post by Lee Carre » Mon Feb 06, 2006 9:06 pm

Neil wrote:
Lee Carre wrote:how's the caching side of things coming along?
Good so far... checked the stats yesterday and the GoogleBot has only consumed a few MB so far.
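if you send a Last-Modified header it'll use even less (a client that already has a page can re-check it and get a short "304 Not Modified" reply instead of the whole page), almost nothing if most of your pages don't change, and your human visitors will notice it too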
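As a rough illustration of the Last-Modified idea, here is a minimal Python sketch of the decision a server makes on each request. Neil's site runs on PHP, so this is only a stand-in for the logic, and the function name `conditional_response` is made up for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at `last_modified`.

    `last_modified` must be a timezone-aware UTC datetime.
    `if_modified_since` is the raw request header value, or None if absent.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: send the full page
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers  # client copy is current, no body needed

    return 200, headers  # send the full page along with Last-Modified
```

A crawler or browser that sends back the date it was given in `If-Modified-Since` gets the 304 branch, which is where the bandwidth saving comes from.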
Neil wrote:For the first time ever, it's used less bandwidth than the MSNBot!
well, i'll hazard a guess that MSNBot is probably crude, and badly written, like most of microsoft's other attempts at web services
Last edited by Lee Carre on Wed Mar 15, 2006 3:54 am, edited 1 time in total.

clarkf1
Established Member
Posts: 175
Joined: Sun Jun 05, 2005 11:36 am

Post by clarkf1 » Mon Feb 06, 2006 11:49 pm

I read at another site that the MSNbot is particularly fond of harvesting email addresses
Cruncher: i7-8700K (6 cores, 12 threads), 16GB DDR4-3000Mhz, 8GB NVIDIA 1080

Halifax--lad
Established Member
Posts: 228
Joined: Sat Sep 10, 2005 11:27 pm

Post by Halifax--lad » Mon Feb 06, 2006 11:52 pm

It wouldn't surprise me if all Bots collected this information. god knows who would use them if they could get hold of information such as that.

Got enough SPAM as it is already

Lee Carre
Member
Posts: 17
Joined: Tue Jan 31, 2006 1:34 am
Location: Jersey, Channel Islands, Great Britain (UK)

email harvesting

Post by Lee Carre » Tue Feb 07, 2006 1:22 am

well indeed, that's why plain text email addresses shouldn't be posted anywhere (unless the owner wants spam, dunno why they'd *want* it though)

a bot will fetch the whole page, regardless of content, if it contains an email address, then it'll be saved too (for example, google lets you view cached versions of pages it crawls)

but it wouldn't surprise me if the MSNBot actively seeks addresses, and tries to get round the methods employed to defeat email bots
Last edited by Lee Carre on Wed Mar 15, 2006 3:54 am, edited 1 time in total.

Lee Carre
Member
Posts: 17
Joined: Tue Jan 31, 2006 1:34 am
Location: Jersey, Channel Islands, Great Britain (UK)

rss feed links not being rendered in firefox

Post by Lee Carre » Tue Feb 07, 2006 1:30 am

as this is a separate subject, a separate post...

regarding anchor links of an RSS feed being rendered in firefox, from the little i've managed to find i don't think it's possible (unless a user has an RSS reader extension for FF) because they're separate data formats, and probably because the description content is in a CDATA block, which is treated as plain character data (which may contain other data) rather than being parsed to see if it's HTML (as it could be a number of things)
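To make the CDATA point concrete, here is a small Python sketch using the standard library XML parser. The feed item and its URL are invented for the example. It shows that the anchor inside the CDATA section arrives as plain text, not as a child element, which is consistent with a browser's XML view not turning it into a link.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS item; the link target is a placeholder.
ITEM = """<item>
  <title>Example news item</title>
  <description><![CDATA[See the <a href="http://example.com/">stats page</a>.]]></description>
</item>"""


def describe(item_xml):
    """Return (number of child elements, text) of the item's <description>."""
    description = ET.fromstring(item_xml).find("description")
    # len(element) counts child elements; .text holds the character data.
    return len(description), description.text


children, text = describe(ITEM)
print(children)  # number of elements the parser found inside <description>
print(text)      # the markup, delivered as an ordinary string
```

A feed reader gets the same string and then chooses to interpret it as HTML. That second step is what a plain XML view does not do.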

it might help if the feed is served as application/rss+xml (then at least the browser knows what it is, and may render the links) or at least application/xml rather than just text/xml (which implies it doesn't need any processing, but application/xml implies that it does need processing before it can be displayed)

if it's served correctly as application/rss+xml (or for Atom application/atom+xml) then the browser might hand the feed to another application (if it's configured to do so) which would at least be more user friendly
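For anyone wanting to see what "served as application/rss+xml" looks like on the server side, here is a minimal Python WSGI sketch. The site in question uses PHP, so this only illustrates the header being set. `FEED_BODY`, `make_feed_app` and `serve` are names invented for this example.

```python
# Hypothetical feed body; a real site would generate this from its news items.
FEED_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<rss version="2.0"><channel><title>Example</title></channel></rss>'
)

# Media types mentioned in this thread, keyed by feed format.
FEED_MEDIA_TYPES = {
    "rss": "application/rss+xml",
    "atom": "application/atom+xml",
}


def make_feed_app(body, feed_format="rss"):
    """Build a WSGI app that serves `body` with a feed-specific Content-Type."""
    # Fall back to generic XML if the format is not one we know about.
    content_type = FEED_MEDIA_TYPES.get(feed_format, "application/xml")

    def app(environ, start_response):
        headers = [
            ("Content-Type", f"{content_type}; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return app


def serve(port=8000):
    """Serve the example feed locally until interrupted (not called here)."""
    from wsgiref.simple_server import make_server

    with make_server("localhost", port, make_feed_app(FEED_BODY)) as server:
        server.serve_forever()
```

Calling `serve()` and requesting `http://localhost:8000/` should show the feed-specific Content-Type in the response headers.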
Last edited by Lee Carre on Wed Mar 15, 2006 3:55 am, edited 1 time in total.

Neil
Site Admin
Posts: 1349
Joined: Mon Apr 18, 2005 8:35 pm
Location: UK

Re: email harvesting

Post by Neil » Tue Feb 07, 2006 6:27 pm

Lee Carre wrote:well indeed, that's why plain text email addresses shouldn't be posted anywhere (unless the owner wants spam, dunno why they'd *want* it though)
I for one get spam daily at my admin account at my boinc subdomain, as well as at my main mundayweb e-mail account.

I have to admit that both addresses are on my web sites, though the latter is not on as many pages as the former.

On the RSS feed front, since last week, my site has used "application/rss+xml" in the RSS link ;-)

Neil.

Lee Carre
Member
Posts: 17
Joined: Tue Jan 31, 2006 1:34 am
Location: Jersey, Channel Islands, Great Britain (UK)

Re: email harvesting

Post by Lee Carre » Thu Feb 09, 2006 3:40 am

Neil wrote:I for one get spam daily at my admin account at my boinc subdomain, as well as at my main mundayweb e-mail account.

I have to admit that both addresses are on my web sites, though the latter is not on as many pages as the former.
there are various ways of using javascript to produce a clickable mailto: link, with the code written in a way that a simple bot can't decode (javascript constructs the link from the components, basically)

for an example, and the method i use see my contact page (development site on a dynamic IP address, that URL uses the current IP address, which may change at any time)
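The general idea of building the link from components can be sketched as below. This is not the method used on the contact page mentioned above, which isn't shown in this thread. It is a hypothetical Python helper (`mailto_script` is an invented name) that emits a small script for a page. It assumes the label is plain text and that the browser supports `document.currentScript`.

```python
import json


def mailto_script(user, domain, label="Email me"):
    """Return an HTML <script> snippet that builds a mailto: link in the browser.

    The address never appears whole in the page source: the user and domain
    are stored separately and joined only when the script runs.
    """
    parts = json.dumps([user, domain])  # e.g. ["admin", "example.org"]
    text = json.dumps(label)            # assumed to be plain text, no markup
    return (
        "<script>\n"
        f"var p = {parts};\n"
        "var a = document.createElement('a');\n"
        # The scheme is split too, so the literal string is absent from the source.
        "a.href = 'mai' + 'lto:' + p[0] + '@' + p[1];\n"
        f"a.textContent = {text};\n"
        "var s = document.currentScript;\n"
        "s.parentNode.insertBefore(a, s);\n"
        "</script>"
    )


snippet = mailto_script("admin", "example.org")
```

This only defeats harvesters that scan the raw page text. A bot that actually runs the page's javascript would still see the finished link.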
Neil wrote:On the RSS feed front, since last week, my site has used "application/rss+xml" in the RSS link ;-)
good :) but the actual feed is still served as text/xml, it should at least be application/xml
use an HTTP viewer and enter the URL of your feed, and look at the line

Content-Type: text/xml
you should just be able to edit your php to send a different Content-Type header
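If no HTTP viewer is to hand, the same check can be scripted. The sketch below is an assumption-laden example, not a definitive tool: `classify_feed_content_type` and `fetch_content_type` are invented names, and the categories simply mirror the preference order described in this thread (feed-specific type first, then application/xml, then text/xml).

```python
import urllib.request

FEED_SPECIFIC = {"application/rss+xml", "application/atom+xml"}


def classify_feed_content_type(header_value):
    """Classify a Content-Type header value for a feed.

    Returns one of "feed-specific", "generic-xml", "text-xml" or "other".
    """
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = header_value.split(";")[0].strip().lower()
    if media_type in FEED_SPECIFIC:
        return "feed-specific"
    if media_type == "application/xml":
        return "generic-xml"
    if media_type == "text/xml":
        return "text-xml"
    return "other"


def fetch_content_type(url):
    """Ask the server for headers only and return its Content-Type value."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Content-Type", "")
```

Usage would be something like `classify_feed_content_type(fetch_content_type(feed_url))`, where `feed_url` is the address of the feed being checked.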
Last edited by Lee Carre on Wed Mar 15, 2006 3:55 am, edited 1 time in total.

Neil
Site Admin
Posts: 1349
Joined: Mon Apr 18, 2005 8:35 pm
Location: UK

Re: email harvesting

Post by Neil » Thu Feb 09, 2006 1:52 pm

Lee Carre wrote:...but the actual feed is still served as text/xml, it should at least be application/xml
use an HTTP viewer and enter the URL of your feed, and look at the line

Content-Type: text/xml
you should just be able to edit your php to send a different Content-Type header
Oooops... You are correct, my mistake.

Thanks for pointing it out ;-)

I'll get it changed when I return from my business trip.

Neil.

Lee Carre
Member
Posts: 17
Joined: Tue Jan 31, 2006 1:34 am
Location: Jersey, Channel Islands, Great Britain (UK)

Post by Lee Carre » Fri Feb 10, 2006 9:42 am

Lee Carre wrote:
Neil wrote:Lee -> I feel that we're almost there! The CDATA stuff seems to work, but in Firefox (at least), it doesn't render the anchor tags (i.e. make clickable links).

Is there any way to do this?

Thanks for your help
my initial reaction is to ask why you'd want this?
...
UPDATE:
ok, CDATA means "don't parse this, just leave it alone",
so CDATA is treated as a chunk of data, as a whole, to be dealt with later, so i highly doubt that you'll get a web browser to render it, at least not with RSS 2.0

i've been looking round, and atom might be able to do what you want, and it's a generally better format anyway, what caught my eye was that you can specify

type="html"
for <title>, <content> and <summary>, so if firefox is clever (and maybe IE as well, but i'm not gonna hold my breath) the HTML content will be rendered in-browser
but to get good results with atom, you need to at least serve it as application/xml (ideally it should be the recommended MIME type of application/atom+xml)
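Here is a minimal Python sketch of what an Atom entry with type="html" might look like when generated with the standard library. The helper name `atom_entry` and the sample text are invented, and the fragment leaves out other elements a complete entry needs (such as id and updated).

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom_entry(title, html_summary):
    """Return a partial Atom <entry> whose summary is marked as HTML."""
    # Serialise Atom elements without a prefix (xmlns="..." on the root).
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    summary = ET.SubElement(entry, f"{{{ATOM_NS}}}summary", {"type": "html"})
    # With type="html" the markup is carried as escaped text; the serialiser
    # turns < > & into entities, so no CDATA section is needed.
    summary.text = html_summary
    return ET.tostring(entry, encoding="unicode")


entry_xml = atom_entry(
    "Stats updated",
    'See the <a href="http://example.com/">stats page</a>.',
)
print(entry_xml)
```

The type attribute tells the consumer that the escaped text should be unescaped and treated as HTML. Whether a given browser acts on that when viewing the feed directly is a separate question.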
Last edited by Lee Carre on Wed Mar 15, 2006 3:56 am, edited 2 times in total.

Neil
Site Admin
Posts: 1349
Joined: Mon Apr 18, 2005 8:35 pm
Location: UK

Post by Neil » Sun Feb 12, 2006 9:06 pm

Thanks for the info ;-)

Neil.
