As explained in How does Cassandra store data? – A simple explanation, Cassandra uses the partition key (the first part of the primary key) to place a record in a partition on a node. Prior to version 1.2, one server = one node. That is, each machine was assigned a single range of values, so that together the machines in the cluster covered every possible hashed value from 0 to 2^127-1. The range of values on a machine was determined by only one token. This, however, caused ‘hot spots’, since a machine whose range held more data or traffic than the others carried a disproportionate load. It also meant that adding or removing a physical machine required the token assignments across the cluster to be recalculated and the data reshuffled.
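The one-token-per-machine scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cassandra's actual code: the node names, the even spacing of tokens, and the use of MD5 reduced modulo 2^127 as a stand-in for the partitioner's hash are all hypothetical choices made for the example.

```python
import bisect
import hashlib

RING_SIZE = 2**127  # tokens run from 0 to 2**127 - 1


def token_for(partition_key: str) -> int:
    # Assumption: an MD5-based stand-in for the partitioner's hash function.
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def assign_single_tokens(nodes):
    # One token per node, spaced evenly so the ranges are balanced.
    step = RING_SIZE // len(nodes)
    return sorted((i * step, node) for i, node in enumerate(nodes))


def owner(ring, partition_key):
    # A node owns the range (previous token, its own token].
    # Keys hashing past the last token wrap around to the first node.
    tokens = [token for token, _ in ring]
    idx = bisect.bisect_left(tokens, token_for(partition_key))
    return ring[idx % len(ring)][1]


ring3 = assign_single_tokens(["node-a", "node-b", "node-c"])
ring4 = assign_single_tokens(["node-a", "node-b", "node-c", "node-d"])

# Count how many of the original nodes need a different token
# to keep the ring balanced once a fourth node joins.
moved_tokens = sum(1 for (t3, _), (t4, _) in zip(ring3, ring4) if t3 != t4)
print(owner(ring3, "user:42"), moved_tokens)
```

In this sketch, keeping the ranges even after adding a fourth node changes the token of two of the three existing nodes, which is the reshuffling problem described above.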
Cassandra 1.2 introduced the option of virtual nodes (vnodes): one server = many virtual nodes. Now, instead of dividing the token range into one contiguous segment per machine, the range of possible hash values is split into many much smaller token ranges, each corresponding to one virtual node. Each machine is assigned many of these smaller ranges. Within a cluster the tokens can be randomly selected and non-contiguous, giving us many smaller ranges that belong to each machine.
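To make the idea concrete, here is a minimal Python sketch of random token assignment with virtual nodes. It is a simplified model rather than Cassandra's implementation: the machine names, the fixed random seed, and the per-machine count of 256 tokens are assumptions chosen for illustration.

```python
import random

RING_SIZE = 2**127
NUM_TOKENS = 256  # assumption: illustrative number of vnodes per machine


def add_machine(ring, machine, rng, num_tokens=NUM_TOKENS):
    # Return a new token -> machine map; existing tokens are left untouched.
    new_ring = dict(ring)
    added = 0
    while added < num_tokens:
        token = rng.randrange(RING_SIZE)
        if token not in new_ring:  # skip the (very unlikely) collision
            new_ring[token] = machine
            added += 1
    return new_ring


def ownership(ring):
    # Sum the width of every range (previous token, token] per machine.
    tokens = sorted(ring)
    owned = {}
    for i, token in enumerate(tokens):
        prev = tokens[i - 1]  # i == 0 wraps around to the last token
        width = (token - prev) % RING_SIZE or RING_SIZE
        owned[ring[token]] = owned.get(ring[token], 0) + width
    return owned


rng = random.Random(42)  # fixed seed so the sketch is repeatable
ring = {}
for name in ["machine-1", "machine-2", "machine-3"]:
    ring = add_machine(ring, name, rng)

before = ownership(ring)
grown = add_machine(ring, "machine-4", rng)
after = ownership(grown)

for name in sorted(after):
    print(name, f"{after[name] / RING_SIZE:.1%}")
```

Because the new machine only inserts its own tokens, none of the existing token assignments change; each new token simply splits one existing range, so the new machine picks up small slices from the machines already in the cluster.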