I am a student at university and our task is to create a search engine. I am having difficulty generating a unique id to assign to each url when added into the frontier. I have attempted using the SHA-256 hashing algorithm as well as Guid. Here is the code that i used to implement the guid:

public string generateID(string url_add)
{
    long i = 1;

    foreach (byte b in Guid.NewGuid().ToByteArray())
    {
        i *= ((int)b + 1);
    }

    string number = String.Format("{0:d9}", (DateTime.Now.Ticks / 10) % 1000000000);

    return number;
}

Why not just use ToString?

public string generateID()
{
    return Guid.NewGuid().ToString("N");
}

If you would like it to be based on a URL, you could simply do the following:

public string generateID(string sourceUrl)
{
    return string.Format("{0}_{1:N}", sourceUrl, Guid.NewGuid());
}

If you want to hide the URL, you could use some form of SHA1 on the sourceURL, but I'm not sure what that might achieve.


Why don't use GUID?

Guid guid = Guid.NewGuid();
string str = guid.ToString();

Here is a 'YouTube-video-id' like id generator e.g. "UcBKmq2XE5a"

StringBuilder builder = new StringBuilder();
Enumerable
   .Range(65, 26)
    .Select(e => ((char)e).ToString())
    .Concat(Enumerable.Range(97, 26).Select(e => ((char)e).ToString()))
    .Concat(Enumerable.Range(0, 10).Select(e => e.ToString()))
    .OrderBy(e => Guid.NewGuid())
    .Take(11)
    .ToList().ForEach(e => builder.Append(e));
string id = builder.ToString();

It creates random ids of size 11 characters. You can increase/decrease that as well, just change the parameter of Take method.

0.001% duplicates in 100 million.


Why can't we make a unique id as below.

We can use DateTime.Now.Ticks and Guid.NewGuid().ToString() to combine together and make a unique id.

As the DateTime.Now.Ticks is added, we can find out the Date and Time in seconds at which the unique id is created.

Please see the code.

var ticks = DateTime.Now.Ticks;
var guid = Guid.NewGuid().ToString();
var uniqueSessionId = ticks.ToString() +'-'+ guid; //guid created by combining ticks and guid

var datetime = new DateTime(ticks);//for checking purpose
var datetimenow = DateTime.Now;    //both these date times are different.

We can even take the part of ticks in unique id and check for the date and time later for future reference.