A search engine is a program that returns a list of sites it deems relevant or connected to your search.
Matches terms in your search against a pre-existing index of websites.
Examples include: Google, Bing, Yahoo, AOL
Relevance: engines need to produce results that match your search; results should be delivered quickly and come from pages that can be trusted.
Indexing: the search engine searches a database called an index to return results quickly; the index is pre-populated by a web crawler.
Web crawlers
Examples include: Googlebot, Bingbot, DuckDuckBot
They visit a site by selecting it from a list of known addresses or by following a link to it.
Records information (text & metatags)
Records the position of each word on the page
Stores them in an index
Follows links to other sites
Robots.txt can be used to instruct web crawlers on what to look at or not.
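The recording steps above (words and their positions stored in an index, which is later matched against search terms) can be sketched roughly as follows. This is only an illustration: the page addresses, page text and the function names buildIndex and search are made up, and real search engine indexes are far more elaborate.

```javascript
// Hypothetical pages a crawler might have fetched (addresses and text are made up).
const pages = {
  "site-a.example/home": "cheap flights to paris",
  "site-b.example/blog": "paris travel guide and cheap hotels",
};

// Build an index: word -> list of { url, position } entries.
function buildIndex(pages) {
  const index = new Map();
  for (const [url, text] of Object.entries(pages)) {
    const words = text.toLowerCase().split(/\s+/).filter(Boolean);
    words.forEach((word, position) => {
      if (!index.has(word)) index.set(word, []);
      index.get(word).push({ url, position });
    });
  }
  return index;
}

// Return the addresses of pages that contain every term in the query.
function search(index, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return [];
  const urlSets = terms.map(
    (term) => new Set((index.get(term) || []).map((entry) => entry.url))
  );
  return [...urlSets[0]].filter((url) => urlSets.every((set) => set.has(url)));
}

const index = buildIndex(pages);
console.log(index.get("paris"));            // where "paris" appears, and at which position
console.log(search(index, "cheap hotels")); // pages containing both words
```

Because the index is built before any search happens, answering a query only needs lookups in the index rather than a fresh visit to every page.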
Robots.txt
A file, named robots.txt, containing instructions for particular crawlers.
Rules either allow or disallow access to resources.
Relies on user-agents choosing to follow the rules; the file itself cannot enforce them.
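As a rough illustration of how a well-behaved crawler might read allow/disallow rules, here is a simplified sketch. The file contents and the function names parseRobots and isAllowed are made up, and it assumes plain path prefixes with the longest matching rule winning; real crawlers also handle wildcards and other details not shown here.

```javascript
// Example robots.txt content (made up for illustration).
const robotsTxt = `
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
Allow: /drafts/public/
`;

// Parse the file into { agentName: [{ allow, path }, ...] }.
function parseRobots(text) {
  const groups = {};
  let currentAgents = [];
  let lastWasAgent = false;
  for (const rawLine of text.split("\n")) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const colon = line.indexOf(":");
    if (colon === -1) continue; // blank or unrecognised line
    const field = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim();
    if (field === "user-agent") {
      if (!lastWasAgent) currentAgents = []; // a new group starts here
      const agent = value.toLowerCase();
      currentAgents.push(agent);
      groups[agent] = groups[agent] || [];
      lastWasAgent = true;
    } else if (field === "allow" || field === "disallow") {
      for (const agent of currentAgents) {
        groups[agent].push({ allow: field === "allow", path: value });
      }
      lastWasAgent = false;
    }
  }
  return groups;
}

// Use the crawler's own group if there is one, otherwise the "*" group.
// The longest matching path prefix decides; no match means allowed.
function isAllowed(groups, agent, path) {
  const rules = groups[agent.toLowerCase()] || groups["*"] || [];
  let best = null;
  for (const rule of rules) {
    if (rule.path !== "" && path.startsWith(rule.path)) {
      if (best === null || rule.path.length > best.path.length) best = rule;
    }
  }
  return best === null ? true : best.allow;
}

const robotGroups = parseRobots(robotsTxt);
console.log(isAllowed(robotGroups, "Googlebot", "/drafts/secret.html"));
console.log(isAllowed(robotGroups, "OtherBot", "/private/data.html"));
```

Nothing in this check stops a crawler from skipping it entirely, which is why the file only works for crawlers that choose to obey it.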
Ranking a web page
Engines use an algorithm to rank webpages so that the most relevant results for the user's query are shown first.
The result of a specific query is then shown on the results page.
Algorithms are kept secret, to avoid manipulation.
Importance:
The more web pages that link towards your web page, the higher the PR (PageRank) for your page.
If a page has lots of incoming links it is likely to be more important than a page with few links.
The higher the PR of the pages linking to your page, the higher the PR of your page.
Damping factor:
The factor diminishes the influence of a page’s PR the more links away it is.
In a simplified PR, there is no damping factor.
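A commonly quoted form of the calculation is PR(A) = (1 - d) + d × (sum of PR(T) / C(T) for every page T linking to A), where d is the damping factor and C(T) is the number of outgoing links on T. The sketch below applies that formula repeatedly to a made-up three-page link graph. The graph, the function name pageRank, the value d = 0.85 and the number of iterations are assumptions for illustration; the algorithms search engines actually use are secret and far more involved.

```javascript
// Hypothetical link graph: each page lists the pages it links TO.
const links = {
  A: ["B", "C"],
  B: ["C"],
  C: ["A"],
};

// Repeatedly apply PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p.
// Assumes every page has at least one outgoing link.
function pageRank(links, d = 0.85, iterations = 50) {
  const names = Object.keys(links);
  let pr = {};
  for (const p of names) pr[p] = 1; // every page starts with the same PR
  for (let i = 0; i < iterations; i++) {
    const next = {};
    for (const p of names) {
      let sum = 0;
      for (const q of names) {
        // q shares its PR equally between all the pages it links to
        if (links[q].includes(p)) sum += pr[q] / links[q].length;
      }
      next[p] = (1 - d) + d * sum;
    }
    pr = next;
  }
  return pr;
}

const ranks = pageRank(links);
console.log(ranks);
```

In this graph C ends up with the highest PR because it is linked to by both A and B, and A comes second because its single incoming link is from the high-ranking C. Calling pageRank(links, 1) corresponds to the simplified version with no damping.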
Client-side vs server-side processing - where it happens
Client-side processing: tasks that are performed on the client (usually the browser)
Server-side processing: tasks that are performed on the server.
Client side:
Render a web page
Adding interactivity to a page
Responding to user input
Animations and other visual effects
Validating a form’s input data, before submission.
Server-side:
Authentication of submitted data (username/password)
Responding to client requests
Retrieving data from a database (bank balances/stock levels)
Performing complex calculations
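The split between the two lists above can be sketched with a login form: the client validates the input before submission, and the server repeats the check and then authenticates against stored data. Everything here is hypothetical (the validation rule, the function names and the stored user); it is a sketch of the idea rather than how any particular site does it.

```javascript
// Hypothetical rule: username is 3-20 letters/digits, password has at least 8 characters.
function validateLogin(form) {
  const errors = [];
  if (!/^[A-Za-z0-9]{3,20}$/.test(form.username || "")) errors.push("username");
  if ((form.password || "").length < 8) errors.push("password");
  return errors;
}

// Client side: stop obviously bad input before any request is sent.
function clientSubmit(form, sendToServer) {
  const errors = validateLogin(form);
  if (errors.length > 0) return { sent: false, errors };
  return { sent: true, response: sendToServer(form) };
}

// Server side: repeat the check (the client-side check can be bypassed),
// then authenticate against stored data.
const storedUsers = { alice: "correct-horse" }; // made-up data; real systems store password hashes
function serverHandleLogin(form) {
  if (validateLogin(form).length > 0) return { ok: false, reason: "invalid input" };
  const known = Object.prototype.hasOwnProperty.call(storedUsers, form.username);
  if (known && storedUsers[form.username] === form.password) return { ok: true };
  return { ok: false, reason: "wrong username or password" };
}

console.log(clientSubmit({ username: "al", password: "short" }, serverHandleLogin));
console.log(clientSubmit({ username: "alice", password: "correct-horse" }, serverHandleLogin));
```

The first call never reaches the server, which saves a round trip and server load; the second is sent and authenticated on the server, where the stored data lives.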
Client-side
+
Processing can easily be localised (time/nationality)
Faster response on the webpage
Less load for the server
-
Can be interfered with by a user executing their own JS
A user can send messages to the server that the site author didn't intend.
JS can be disabled
Server-side
+
Has knowledge of all connected clients (e.g. supports lobbies/game comms)
Persistent storage of data on the server can allow clients to access profiles on any device.
-
A spike in client requests/page downloads can overload the server (as in a DDoS attack)
More traffic to server: needs greater bandwidth
Server side processing
Processing that takes place on the webserver. Data is sent from the browser to the server, the server processes it and sends the output back to the browser.
Client side processing
Processing that takes place in the web browser.
Client side processing
Doesn't require data to be sent back and forth meaning code is much more responsive
Code is visible which means it can be copied
The browser may not run the code either because it doesn't have the capability or because the user has intentionally disabled client side code
Server side processing
Removes the reliance on the browser having the correct interpreter
Hides the code from the user, protecting copyright and avoiding it being amended/circumvented
Puts extra load on the server, at the cost of the company hosting the website
The web
HTML (HyperText Markup Language): the content of a webpage
CSS (Cascading Style Sheets): the look, feel and layout of a webpage
JS (JavaScript): programming language to add interactivity to webpages, e.g. validation of a data entry web form
HTML
Most HTML tags come in pairs
Pairs have an opening and closing tag
Tags can be nested inside each other
The browser uses tags to understand how to display the content.
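The pairing and nesting rules above can be checked with a stack: push each opening tag, and require each closing tag to match the most recent one still open. The sketch below is a simplified illustration, not how browsers actually parse HTML; the function name isWellNested is made up, and it assumes attribute values never contain a > character. Tags with no end tag, such as img, are skipped.

```javascript
// Tags that have no closing tag (a partial list, enough for this sketch).
const VOID_TAGS = new Set(["img", "br", "hr", "meta", "link", "input"]);

function isWellNested(html) {
  const stack = [];
  // Matches <name ...> and </name>; group 1 is "/" for a closing tag.
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const closing = match[1] === "/";
    const name = match[2].toLowerCase();
    if (VOID_TAGS.has(name)) continue;  // no pair expected
    if (!closing) {
      stack.push(name);                 // remember the open tag
    } else if (stack.pop() !== name) {
      return false;                     // closed in the wrong order
    }
  }
  return stack.length === 0;            // nothing left open
}

const goodPage =
  '<html><body><h1>Title</h1><p>See <a href="x.html">this</a></p>' +
  '<img src="a.png" alt="A"></body></html>';
console.log(isWellNested(goodPage));              // true
console.log(isWellNested("<p><a>text</p></a>"));  // false: tags overlap
```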
Tag: Action
<html>: Start of the HTML document
<head>: Metadata
<title>: Title shown on the browser tab
<body>: Body (includes all text, images and links)
<hx>: Heading, with x = 1 to 6 (1 is the largest)
<p>: Paragraph
<img>: Image (needs src = file and alt text; width and height should also be specified; no end tag)
<a>: Hyperlink
<ol>: Ordered list
<ul>: Unordered list
<li>: List item
Div tags group together all nested tags into a referenced division of the page.
Classes & identifiers are attributes given to elements on a webpage which you wish to style in a particular way.
Multiple elements across web pages can be assigned to a certain class. This means elements can follow a consistent style and the styling & formatting only has to be defined once.
An identifier is a unique name given to an element on a web page. Only one element on a page can be associated with a given identifier. In CSS an identifier is selected with a hash (#) and a class with a dot (.).
CSS
This is used to control the styling of webpages, including headings, text and links.
Border styles can also be controlled.
In CSS, the order of precedence is: ID, class then tag.
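The order of precedence can be illustrated with a small sketch that picks the winning colour for an element when ID, class and tag rules all apply. The rules are written as plain data rather than real CSS, and the names PRECEDENCE, matches and resolveColor are made up; real browsers use a fuller specificity calculation that also covers combined selectors and inline styles.

```javascript
// Precedence from the notes: ID beats class, class beats tag.
const PRECEDENCE = { tag: 1, class: 2, id: 3 };

// Hypothetical rules, each with the CSS it stands for in a comment.
const rules = [
  { type: "tag", name: "h1", color: "black" },       // h1 { color: black }
  { type: "class", name: "warning", color: "red" },  // .warning { color: red }
  { type: "id", name: "mainTitle", color: "blue" },  // #mainTitle { color: blue }
];

function matches(rule, element) {
  if (rule.type === "tag") return rule.name === element.tag;
  if (rule.type === "class") return element.classes.includes(rule.name);
  return rule.name === element.id;
}

// Pick the matching rule with the highest precedence;
// between rules of equal precedence the later one wins.
function resolveColor(element, rules) {
  let winner = null;
  for (const rule of rules) {
    if (!matches(rule, element)) continue;
    if (winner === null || PRECEDENCE[rule.type] >= PRECEDENCE[winner.type]) {
      winner = rule;
    }
  }
  return winner ? winner.color : null;
}

// <h1 id="mainTitle" class="warning">
console.log(resolveColor({ tag: "h1", classes: ["warning"], id: "mainTitle" }, rules)); // "blue"
// <h1 class="warning">
console.log(resolveColor({ tag: "h1", classes: ["warning"], id: null }, rules));        // "red"
```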
CSS can also be written inline inside an HTML tag → <h1 style="color:blue">
JavaScript
It is case sensitive.
Variables are declared without a datatype.
Statements end with ; (JavaScript can insert a missing semicolon automatically, but writing it is the convention)
Blocks of code are grouped using curly brackets {braces}
Assignment is =
Test for equality is ==
// is a comment
Concatenation is +
% is modulus
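A short sketch that uses each of the points above in one place. The variable names are made up for the example.

```javascript
// A comment starts with two slashes.
let total = 0;                  // declared without a datatype; = is assignment
let label = "Even numbers: ";   // the same keyword declares a string

for (let n = 1; n <= 6; n++) {  // curly brackets group the loop body
  if (n % 2 == 0) {             // % gives the remainder; == tests for equality
    total = total + n;
    label = label + n + " ";    // + concatenates strings
  }
}

// Names are case sensitive: Total would be a different variable from total.
const summary = label + "(sum " + total + ")";
console.log(summary);           // prints "Even numbers: 2 4 6 (sum 12)"
```

JavaScript also has a stricter === operator that compares type as well as value; the notes use ==, so the sketch does too.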
JavaScript is a programming language that runs in a web browser (1), can be embedded into HTML (1) with the <script> tag (1) and is used to add interactivity to a page (1)
Issue with not using an external CSS stylesheet: the site is slower to access (as the formatting information is reloaded for every page)