
Thread: List<String> vs String[]

  1. #1
    Junior Member (joined Mar 2014, 18 posts)

    List<String> vs String[]

    Hi, dear STSdb developers.

    I need some advice.

    My key is of type integer.

    My record type is a list of integers. Currently I'm using List<String>; I'd like to know whether there is any noticeable performance difference if I use String[] instead.

    The record count may extend beyond 100,000,000.

    My other question: I get a rather long response time when reading random keys.

    For example, with 500K records, fetching 1000 random keys takes 4 seconds! (5400 RPM disk, mobile Core i5, DDR3 1300)

    Is that timing normal?

    Are there any optimizations I could apply?

  2. #2


    There isn't any significant performance difference between List<string> and string[]. This can be seen from the following examples.

    The first scenario is with integer type for the keys and List<string> type for the records:
    using (IStorageEngine engine = STSdb.FromMemory())
    {
        ITable<int, List<string>> table = engine.OpenXTable<int, List<string>>("Table");
    
        //Some actions with table
    }
    
    The persist code for the records looks like:
    void Write(BinaryWriter writer, IData idata)
    {
        List<string> dataValue = ((Data<List<string>>)idata).Value;
    
        CountCompression.Serialize(writer, (ulong)dataValue.Count);
    
        for (int i = 0; i < dataValue.Count; i++)
        {
            if (dataValue[i] != null)
            {
                writer.Write(true);
                writer.Write(dataValue[i]);
            }
            else
                writer.Write(false);
        }
    }
    
    IData Read(BinaryReader reader)
    {
        int count = (int)CountCompression.Deserialize(reader);
    
        List<string> dataValue = new List<string>(count);
    
        for (int i = 0; i < count; i++)
        {
            dataValue.Add(reader.ReadBoolean() ? reader.ReadString() : null);
        }
    
        return new Data<List<string>>(dataValue);
    }
    

    The second scenario is with integer type for the keys and string[] type for the records:
    using (IStorageEngine engine = STSdb.FromMemory())
    {
        ITable<int, string[]> table = engine.OpenXTable<int, string[]>("Table");
    
        //Some actions with table
    }
    
    The persist code for the records looks like:
    void Write(BinaryWriter writer, IData idata)
    {
        string[] dataValue = ((Data<string[]>)idata).Value;
    
        CountCompression.Serialize(writer, (ulong)dataValue.Length);
    
        for (int i = 0; i < dataValue.Length; i++)
        {
            if (dataValue[i] != null)
            {
                writer.Write(true);
                writer.Write(dataValue[i]);
            }
            else
                writer.Write(false);
        }
    }
    
    IData Read(BinaryReader reader)
    {
        int length = (int)CountCompression.Deserialize(reader);
    
        string[] dataValue = new string[length];
    
        for (int i = 0; i < length; i++)
        {
            dataValue[i] = reader.ReadBoolean() ? reader.ReadString() : null;
        }
    
        return new Data<string[]>(dataValue);
    }
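
    Both Write methods above emit the same layout: an element count, then a presence flag and the string for each element. The following is a minimal standalone sketch of that point, not STSdb code. WireFormatSketch and WriteStrings are hypothetical names, and a plain 32-bit count stands in for CountCompression, so the exact bytes differ from what STSdb writes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class WireFormatSketch
{
    // Writes a count, then a presence flag and the value for each element.
    // IList<string> is implemented by both List<string> and string[].
    public static byte[] WriteStrings(IList<string> values)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            // Assumption: a fixed 4-byte count instead of CountCompression.Serialize
            writer.Write(values.Count);

            for (int i = 0; i < values.Count; i++)
            {
                if (values[i] != null)
                {
                    writer.Write(true);
                    writer.Write(values[i]); // length-prefixed string
                }
                else
                    writer.Write(false);
            }

            writer.Flush();
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        var list = new List<string> { "10", null, "20" };
        string[] array = { "10", null, "20" };

        byte[] fromList = WriteStrings(list);
        byte[] fromArray = WriteStrings(array);

        // Same elements in either container give byte-for-byte identical output
        Console.WriteLine(fromList.SequenceEqual(fromArray)); // prints True
    }
}
```

    Since the stored bytes are the same, any difference between the two types comes down to in-memory container overhead rather than to what is written to disk.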
    

    The serialization and deserialization logic can be replaced with a custom persist implementation that is better suited to the types you store.
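
    Because the records in the question are integers held as strings, one direction a custom implementation could take is to write each value as a raw 32-bit integer. The sketch below is standalone and hypothetical: IntListCodec is not an STSdb type, it assumes every element is non-null and parses as an Int32, and plugging it into RecordPersist is left out because that depends on the STSdb interfaces.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public static class IntListCodec
{
    // Encodes the strings as a 4-byte count followed by 4 bytes per value.
    // Assumption: every element is a valid Int32 in invariant-culture form.
    public static byte[] Encode(IList<string> values)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(values.Count);

            for (int i = 0; i < values.Count; i++)
                writer.Write(int.Parse(values[i], CultureInfo.InvariantCulture));

            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reverses Encode and returns the values as strings again.
    public static string[] Decode(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new BinaryReader(stream))
        {
            int count = reader.ReadInt32();
            var values = new string[count];

            for (int i = 0; i < count; i++)
                values[i] = reader.ReadInt32().ToString(CultureInfo.InvariantCulture);

            return values;
        }
    }

    public static void Main()
    {
        string[] record = { "1234567890", "42", "-7" };

        byte[] encoded = Encode(record);

        Console.WriteLine(encoded.Length);                    // prints 16
        Console.WriteLine(string.Join(",", Decode(encoded))); // prints 1234567890,42,-7
    }
}
```

    A long number such as "1234567890" takes 4 bytes here instead of a flag, a length prefix and 10 characters, while a short one such as "42" takes the same 4 bytes either way. Strings that are not in canonical form (for example "007") would not round-trip, so this only fits if the data really is integers.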

    As for your second question: that execution time isn't normal. On your hardware it should be lower. We ran a series of tests on very similar hardware and the execution time was ~600 ms.

    The test code was the following:
    Random rand = new Random();
    Stopwatch time = new Stopwatch();
    
    using (IStorageEngine engine = STSdb.FromFile("Database.stsdb4"))
    {
        ITable<int, string[]> table = engine.OpenXTable<int, string[]>("Table");
    
        // StringArrayPersist is a custom persist class that is not shown in this post
        table.Descriptor.RecordPersist = new StringArrayPersist();
    
        List<int> randomKeysForSearch = new List<int>(1000);
    
        // Insert 500,000 records, each an array of 100 random numbers stored as strings
        for (int i = 0; i < 500000; i++)
        {
            int nextKey = rand.Next();
    
            string[] array = new string[100];
    
            for (int j = 0; j < 100; j++)
                array[j] = rand.Next().ToString();
    
            // Remember every 7th key, up to 1000 keys, for the lookup phase
            if (i % 7 == 0 && randomKeysForSearch.Count < 1000)
                randomKeysForSearch.Add(nextKey);
    
            table[nextKey] = array;
        }
    
        // Time only the lookups of the remembered keys
        time.Start();
        for (int i = 0; i < randomKeysForSearch.Count; i++)
        {
            string[] record;
            table.TryGet(randomKeysForSearch[i], out record);
        }
        time.Stop();
    
        Console.WriteLine(time.ElapsedMilliseconds);
    }
    
    You can share your code and we will try to help you to optimize it.

    Below are the concrete hardware specifications of the computer used for the tests:

    [Images: CPU and memory specifications of the test machine]

2002 - 2014 STS Soft SC. All Rights reserved.
STSdb, Waterfall Tree and WTree are registered trademarks of STS Soft SC.