HOW TYPOGRAPHY AFFECTS THE USER’S PERCEPTION OF THE WEBPAGE

What is typography and why is it important?
The rapid development of the art of printing in recent decades is a result of the general development of civilization and of mankind’s still growing cultural needs. Typography is one of those disciplines whose function is to stay in the shadows. When reading articles, scanning pages and looking for information, users should not have to pay attention to fonts, line spacing or text alignment, nor should they need to enlarge or reduce the font size. Unfortunately, this need for invisibility also makes the significance of typography difficult to appreciate. Typography is the art of matching the form of lettering to the medium. To choose the right typeface, it is important to ask a number of marketing questions: how long is the text, what is its purpose, what matters most, and what should the text help to achieve? There are also technical questions about readability and aesthetics, about the feel of the text (should it feel serious and professional, or rather informal or decorative) and about whether users will read the text or just scan it. Once we know the requirements and the audience, it is easier to choose the right typography for the project and to meet all the goals set for the text.

HOW THE USER WANTS TO SEE THE TEXT ON THE WEB – READABILITY

The brain likes harmony, and the eye moves easily through the text if there’s adequate space for it – subsequent lines appear in predictable places. If line height, font size, padding and margins are chosen consistently, the text block gains rhythm.

It is important that all subsequent lines repeat the same rhythm, just like the other elements on the page. Picking the right line height depends on the font size and typeface: the larger the font size, the greater the line height should be. Traditionally, the optimum ratio of line height to font size is considered to be 1.3 to 2 (or even more in special cases). This choice affects the size of the padding and margins that should be used. To arrange the elements well and keep the distances accurate, it is very helpful to overlay an image of evenly spaced horizontal lines on the page and position elements so that they begin and end on those lines, with the text sitting between them. Defining the line height is a less sophisticated method than baseline alignment, but it does the job well and is easy to code and maintain.
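As a rough illustration of the ratio described above, the following Python sketch derives a line height from a font size and rounds spacing values up to whole multiples of that line height, so that elements start and end on the same horizontal grid. The function names and the default ratio of 1.5 are assumptions chosen for this example, not values prescribed here.

```python
import math


def line_height_px(font_size_px: float, ratio: float = 1.5) -> int:
    """Return a whole-pixel line height for the given font size.

    The default ratio of 1.5 is an assumption; the text suggests 1.3 to 2.
    """
    return round(font_size_px * ratio)


def snap_to_grid(spacing_px: float, grid_px: int) -> int:
    """Round a margin or padding up to a whole number of grid lines."""
    lines = max(1, math.ceil(spacing_px / grid_px))
    return lines * grid_px


base = line_height_px(16)       # line height for 16px body text
print(base)
print(snap_to_grid(30, base))   # a 30px margin moved onto the grid
```

Using one base line height for every margin and padding is what keeps the lines of neighbouring blocks in step with each other.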

The least noticeable element of typography is whitespace, yet its possibilities are huge. Whitespace is not only the space between words and lines, but also all the spacing between elements. Leaving margins blank is not a waste of space; it supports the hierarchy. A proper choice of whitespace can also define the character of the design.

Font size is also important. Fonts that are too small, although often associated with professional design, are generally difficult to read, or at least reduce the speed and ease of reading. It is not without reason that almost all browsers use 16px as the default font size. Fonts that are too large can be difficult to read as well. Contrast also determines readability: it should be neither too low nor too high, and people are more accustomed to reading on a light background. Pages with lots of text should avoid dark graphic designs. Recently it has become common for web designers to soften the contrast between pure white and pure black so as not to strain the reader’s eyes.
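One way to put a number on contrast is the contrast ratio defined in the WCAG 2 accessibility guidelines. That measure is not part of the discussion above and is brought in here only as an assumption about how contrast could be quantified. The sketch below compares pure black text with a softened dark grey on a white background; the colour values are examples.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a colour given as '#rrggbb'."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colours, from 1 (none) to 21 (maximum)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


print(contrast_ratio("#000000", "#ffffff"))  # pure black on white
print(contrast_ratio("#333333", "#ffffff"))  # softened dark grey on white
```

The dark grey still gives a high ratio, which is consistent with the idea that slightly softened contrast remains comfortable to read.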

In all of this, the most important things are moderation and transparency. First of all, you can’t really see good typography – it just melts into the background and lets you read. From a technical standpoint, two fonts make a good team; in a simple design, more is already a crowd. In most cases two fonts are sufficient to achieve every purpose: one style for headers and another for body text. Links should be easily identifiable. Poorly chosen fonts loaded via @font-face can be hard on the eyes when they are antialiased badly. The same thing happens if fonts are used carelessly in Flash. Blurred letters tire the eyes and slow down reading. Also, bold, italic, caps and other means of emphasis should be used one at a time (bold italic is already too much). The rules for which parts of the text are highlighted, and how, should be decided at the beginning.

Reading long texts requires concentration. Very often the user’s attention is distracted by ads filling a column on one or both sides of the article. If we want users to read the text to the end, the number of potentially distracting items should be limited.

THE ART OF TYPOGRAPHY AS A MARKETING TOOL – TYPOGRAPHY IN LOGOTYPES AND ITS ASSOCIATION WITH THE WEBSITE OR APP PROJECT

New patterns of typography and lettering were developed mainly through the expansion of commercial advertising. Visual identification, which includes posters, brochures, packaging and advertisements and, most importantly, the name and the logo, contains text information in addition to graphics. To fulfill its purpose, this text should be communicative, clear and attractive, and also harmonized with the illustration and the nature of the product being advertised. A well-formed trademark can reflect the nature and style of the organization. At the same time, the positive qualities and associations of the symbol become the basis for building a favorable image of the company. Each organization is unique, and its identity should grow out of its roots and experiences. The identity of the company should be simple and clear, because it sets the standard against which the company’s products, behaviors and actions are measured. That is why it should be visible and cover every aspect of the company’s activity. Typography is part of the company’s identity, so it should reflect the company’s character and values. Corporate materials should correspond with the company and its goals, which means that everything from business cards, letterhead and product manuals to newspaper advertising, the website and multimedia presentations should have a uniform character. A properly selected typeface with its variations, together with carefully designed layouts, creates the image of a company that is stable and effective in operation.

Therefore, typographic elements on a website or in a web or mobile application should be consistent with the guidelines in the company’s brand book or, if the designated font family is intended only for print, should at least relate to it coherently.

There are trademarks consisting of a logotype (text) only, and trademarks containing both a symbol (graphical element) and a logotype. Trademarks made only of a symbol are quite rare. If the trademark consists only of the company name, it must remain fully readable at every size. The logo should not impede the reading of the name, which sometimes happens when, for example, the first letter of the name is stylized. Adhering to the principles of correct typography lets the company appear as a professional, trustworthy and reliable partner, which can only lead to success for both sides!

 

VECTOR GRAPHICS ON WEBSITES

SVG is a free and open format for vector graphics. It is based on XML, a language developed to represent arbitrary data in a format that can be understood by both humans and computers. SVG uses the structure known from (X)HTML to represent data. The tags, elements and attributes that can be used and understood by vector graphics editors are defined in the SVG specification. The format provides elements such as lines and shapes (circles, triangles and more complicated shapes), which together make up a graphic. It also supports gradients, transformations of elements (e.g. rotation), effects, animations and interaction with JavaScript.

Features

SVG is perfect for logos, charts or uncomplicated graphics, which can be modified or animated on a website.

The main features of SVG:

  • Scalability
    – changing the size of a graphic does not affect its quality, so we do not have to worry about the phenomenon called ‘pixelation’. This works perfectly for responsive websites (RWD) and on mobile devices with high-resolution (high-density) screens.
  • Easy to modify
    – we can easily change the shapes of elements by changing coordinates, give them stroke colors and fillings, and animate them.
  • Lower file size
    – in many cases, vector images take up much less space than their raster equivalents while maintaining the same quality.
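The features listed above can be illustrated with a short Python sketch that builds a small SVG document using the standard library. The shape, the colours and the function name build_badge are made up for this example. Note that only the width and height change between the two sizes, while the coordinates inside the viewBox stay the same.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # emit <svg xmlns="..."> without a prefix


def build_badge(size_px: int, fill: str = "#3366cc") -> str:
    """Return SVG markup for a circle with a diagonal line, drawn at size_px."""
    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"width": str(size_px), "height": str(size_px), "viewBox": "0 0 100 100"},
    )
    # Shapes are plain elements; changing an attribute changes the drawing.
    ET.SubElement(
        svg, f"{{{SVG_NS}}}circle",
        {"cx": "50", "cy": "50", "r": "40", "fill": fill, "stroke": "#000000"},
    )
    ET.SubElement(
        svg, f"{{{SVG_NS}}}line",
        {"x1": "20", "y1": "80", "x2": "80", "y2": "20", "stroke": "#ffffff"},
    )
    return ET.tostring(svg, encoding="unicode")


small = build_badge(32)                    # icon size
large = build_badge(512, fill="#cc3333")   # same drawing, larger and recoloured
print(small)
```

Because the output is plain text, the same markup can be saved to a .svg file or pasted into a page.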

Implementing the SVG on a website

A graphic made in vector graphics software (e.g. Adobe Illustrator) can easily be saved in the SVG format and used on a website.

SVG can be placed in the website code in several ways. One of them is to use the HTML IMG tag (just like any other image) or to set it as a background in CSS.

<img src="vector_image.svg" />

#myvectoimage {
  background-image: url(vector_image.svg);
}

Another way (known from Flash) is to embed it as an object using the OBJECT or EMBED tag.

<object type="image/svg+xml" data="vector_image.svg">SVG not supported</object>

<!-- or -->

<embed type="image/svg+xml" src="vector_image.svg" />

Another way is to add SVG content directly into the HTML code.

<body>
<!-- Your SVG code -->
<svg xmlns="http://www.w3.org/2000/svg" width="500px" height="500px">
...
</svg>
</body>

Which of these is best? It depends on your needs. The IMG tag has restrictions: because of the browsers’ security policy, scripts inside an SVG loaded this way are not executed, and the page cannot style the elements inside it. The EMBED tag was not part of the HTML 4 or XHTML specifications (it was only standardized in HTML5), even though basically every browser supports it, which is why it has traditionally been recommended to avoid it. The OBJECT tag appears to be the best option, because it gives us full control over the SVG (we can apply styles, animate, and run JavaScript on its elements) while keeping the graphic separate from the structure of the page.

Browser support

SVG has been available for over 10 years. Internet Explorer has supported it natively since version 9. Currently, the vast majority of browsers fully support SVG, including on mobile devices.

 

ELASTICSEARCH GOTCHAS – PART 1

Elasticsearch is a search engine based on a trusted and mature library, Apache Lucene. The high activity in the project’s git repository and its use in projects such as GitHub, SoundCloud, Stack Overflow and LinkedIn testify to its great popularity. The “Elastic” part of the name says it all about the nature of the system, whose capabilities are enormous: from simple file search on a small scale, through knowledge discovery, to big data analysis in real time.

What makes Elastic more powerful than the competition is its set of default configurations and behaviors, which let you create a cluster and start adding documents to an index within a couple of minutes. Elastic will configure a cluster for you, define an index, and infer the field types from the first document it receives; when you add another server, it will automatically distribute the index data between the servers.

Unfortunately, this automation hides what the default settings imply, and the result is often misleading. This article starts a series in which I will deal with the most common gotchas you might encounter while building an Elastic-based app.

The number of shards cannot be changed

Let’s index the first document using the index API:

$ curl -XPUT 'http://localhost:9200/myindex/employee/1' -d '{
    "first_name" :   "Jane",
    "last_name" :    "Smith",
    "street_number": 12
  }'

At this point, Elastic creates an index for us, named myindex. What is not visible here is the number of shards assigned to the index. Shards can be understood as individual processes, each responsible for indexing, storing and searching some part of the documents of the whole index. When a document is indexed, Elastic decides which shard it belongs to, based on the following formula:

shard = hash(document_id) % number_of_primary_shards
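To see what this formula means in practice, here is a minimal Python sketch. It uses zlib.crc32 as a stand-in hash, which is an assumption made only for illustration; Elasticsearch uses its own hash function internally. The function name shard_for and the document ids are hypothetical.

```python
import zlib


def shard_for(document_id: str, number_of_primary_shards: int) -> int:
    """Pick a shard the way the formula describes.

    crc32 is a stand-in hash for this sketch, not the hash Elasticsearch uses.
    """
    return zlib.crc32(document_id.encode("utf-8")) % number_of_primary_shards


ids = [str(i) for i in range(1, 1001)]

# Documents whose shard would differ if the index had 10 shards instead of 5.
moved = [i for i in ids if shard_for(i, 5) != shard_for(i, 10)]
print(f"{len(moved)} of {len(ids)} documents would be looked up in a different shard")
```

Any document in the moved list was stored in one shard but would be looked up in another after the change, so it could no longer be found by its id.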

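Because the result depends on number_of_primary_shards, changing that number would make existing documents map to different shards than the ones they are stored in. This is why the number of primary shards cannot be changed for an index that contains documents. So, before indexing the first document, always create the index manually, specifying a number of shards that you think is sufficient for the expected volume of indexed data: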

$ curl -XPUT 'http://localhost:9200/myindex/' -d '{
    "settings" : {
      "number_of_shards" : 10
    }
  }'

The default value for number_of_shards is 5 in the Elasticsearch versions this article is based on (since version 7.0 the default is 1).
This means that, by default, the index can be scaled up to 5 servers that store data during indexing. For a production environment, the number of shards should be set according to the expected indexing rate and the size of the documents. For development and testing environments, I recommend setting the value to 1. Why? That is explained in the next section of this article.

Sorting the text search results with a relatively small number of documents

When we search for a document with a phrase:

$ curl -XGET 'http://localhost:9200/myindex/my_type/_search' -d '{
    "query": {
      "match": {
        "title": "The quick brown fox"
      }
    }
  }'

Simply put, Elastic processes a text search in a few steps:

  1. the phrase from the request is converted into the same form in which the documents were indexed; in our case this is the set of terms:
    [“quick”, “brown”, “fox”] (“the” is removed because it’s insignificant),
  2. the index is searched for documents that contain at least one of the search terms,
  3. every matching document is evaluated for its relevance to the search phrase,
  4. the results are sorted by the calculated relevance and the first page of results is returned to the user.
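The four steps above can be sketched in a few lines of Python. This is a toy model and not how Elasticsearch is implemented: the stop-word list, the whitespace tokenizer and the scoring (the share of query terms found in a document) are all simplifying assumptions, and the function names are hypothetical.

```python
STOPWORDS = {"the", "a", "an"}  # assumed stop-word list for this sketch


def analyze(text: str) -> list[str]:
    """Step 1: lowercase, split on whitespace and drop insignificant words."""
    return [term for term in text.lower().split() if term not in STOPWORDS]


def search(query: str, documents: dict[int, str], page_size: int = 10) -> list[int]:
    terms = analyze(query)
    hits = []
    for doc_id, text in documents.items():
        doc_terms = set(analyze(text))
        matched = [term for term in terms if term in doc_terms]
        if matched:                                # step 2: at least one term matches
            score = len(matched) / len(terms)      # step 3: stand-in relevance score
            hits.append((score, doc_id))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))   # step 4: most relevant first
    return [doc_id for _, doc_id in hits[:page_size]]


docs = {1: "The quick brown fox", 2: "A lazy dog", 3: "brown bear"}
print(search("The quick brown fox", docs))
```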

In the third step, the following values (among others) are taken into account:

  1. how many words from the search phrase are in the document
  2. how often a given word occurs in a document (TF – term frequency)
  3. whether and how often the matching words occur in other documents (IDF – inverse document frequency) – the more popular the word in other documents, the less significant
  4. how long the document is

How IDF is calculated matters here. For performance reasons, Elastic does not calculate this value across all documents in the index. Instead, every shard (index worker) calculates its own local IDF and uses it for scoring. Therefore, when searching an index with a small number of documents, we may get substantially different results depending on the number of shards in the index and on how the documents are distributed among them.

Let’s imagine that we have 2 shards in an index; in the first one there are 8 indexed documents containing the word “fox”, and in the second one only 2 documents containing it. As a result, the local IDF of the word “fox” will differ significantly between the two shards, and this may produce an unexpected ordering of results. Therefore, an index consisting of only one primary shard should be created for development purposes:

$ curl -XPUT 'http://localhost:9200/myindex/' -d \
  '{"settings" : { "number_of_shards" : 1 } }'
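The two-shard scenario described above can be sketched numerically. The example below uses one classic Lucene TF-IDF form of IDF, 1 + ln(N / (df + 1)), as an assumption; the exact formula depends on the Elasticsearch version and the configured similarity. The shard size of 10 documents is also assumed, since only the number of documents containing “fox” is given.

```python
import math


def classic_idf(total_docs: int, docs_with_term: int) -> float:
    """IDF in a classic Lucene TF-IDF form: 1 + ln(N / (df + 1))."""
    return 1.0 + math.log(total_docs / (docs_with_term + 1))


DOCS_PER_SHARD = 10  # assumption: the scenario does not give shard sizes

idf_shard_1 = classic_idf(DOCS_PER_SHARD, 8)         # "fox" in 8 local documents
idf_shard_2 = classic_idf(DOCS_PER_SHARD, 2)         # "fox" in 2 local documents
idf_single = classic_idf(2 * DOCS_PER_SHARD, 8 + 2)  # one shard holding everything

print(f"shard 1:      {idf_shard_1:.3f}")
print(f"shard 2:      {idf_shard_2:.3f}")
print(f"single shard: {idf_single:.3f}")
```

Under these assumptions the same word is weighted much more heavily in the second shard, so documents of equal quality receive different scores depending on where they happen to be stored. With a single shard there is only one IDF value and the ordering is stable.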

Viewing the results of “far” search pages kills your cluster

As described in the previous sections, the documents in an index are divided between entirely separate index processes: shards. Every shard is completely independent and deals only with the documents assigned to it.

When we search an index with millions of documents and ask for the top 10 results, every shard must return its 10 best-matching results to the cluster node that initiated the search. The responses from all shards are then merged and the top 10 results (across the whole index) are chosen. This approach makes it possible to distribute the search process efficiently between many servers.

Let’s imagine that our app shows 50 results per page and places no restriction on the number of pages a user can view. Remember that our index consists of 10 primary shards (1 per server).

Let’s see what retrieving the search results looks like for the 1st and the 100th page:

Page No. 1 of search results:

  1. The node which receives a query (controller) passes it on to 10 shards.
  2. Every shard returns its 50 best matching documents sorted by relevance.
  3. After the responses have been received from every shard, the controller merges the results (500 documents).
  4. Our results are the top 50 documents from the previous step.

Page No. 100 of search results:

  1. The node which receives a query (controller) passes it on to 10 shards.
  2. Every shard returns its 5000 best matching documents sorted by relevance.
  3. After receiving responses from every shard, the controller merges the results (50000 documents).
  4. Our results are the documents from the previous step positioned 4951 – 5000.

Assuming that one document is 1KB in size, the second case means that ~50MB of data must be sent around the cluster and processed in order to show 50 results to one user.
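The arithmetic behind both cases can be written down as a small Python helper. The function name deep_page_cost is hypothetical, and one document is assumed to be 1000 bytes to match the 1KB estimate.

```python
def deep_page_cost(page: int, page_size: int, shards: int,
                   doc_size_bytes: int = 1000) -> dict[str, int]:
    """Estimate the work needed to serve one page of search results."""
    per_shard = page * page_size      # each shard returns its top page*size hits
    merged = per_shard * shards       # the controller merges all of them
    return {
        "per_shard": per_shard,
        "merged": merged,
        "bytes_moved": merged * doc_size_bytes,
        "first_position": (page - 1) * page_size + 1,
        "last_position": page * page_size,
    }


for page in (1, 100):
    print(page, deep_page_cost(page, page_size=50, shards=10))
```

The cost grows linearly with the page number and with the number of shards, while the number of results actually shown to the user stays at one page.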

It’s not hard to notice that network traffic and index load increase significantly with each successive result page. That’s why it is not recommended to make the “far” search pages available to the user. If our index is well configured, then users should find the result they are interested in on the first pages, and we protect our cluster from unnecessary load. To see this rule in practice, check how many result pages the most popular web search engines let you view.

It is also interesting to observe the response time for successive search result pages. For example, below are the response times for individual result pages in Google Search (the search term was “search engine”):

| Search result page (10 documents per page) | Response time |
|--------------------------------------------|---------------|
| 1                                          | 250ms         |
| 10                                         | 290ms         |
| 20                                         | 350ms         |
| 30                                         | 380ms         |
| 38 (last one available)                    |               |

In the next part, I will look closer into the problems regarding document indexing.