Article 31 - Programming Data Structures

There are certain data structures that every programmer should know such as Array, List, Stack, Heap, Queue, and Tree (Binary).

Array

An array is a fixed block of memory that is can be used for constant time O(1) lookup, insertion, and deletion. If the array is unsorted, search operations require O(N) time; if the array is sorted, then we can apply a binary search approach which takes O(logN) time. Each element of the array is mapped using an index, often a memory address.

Unfourtunately, arrays cannot increase in size unless a larger block of memory is reallocated for it.

Array Illustration.

List

There are many forms of a List, but the most common ones are ArrayList and LinkedList, which also come in many forms. We will not discuss about the ArrayList, since it's just a dynamically growing array.

A LinkedList also comes in many flavours, singly-linked or doubly-linked are the primarly types.

A singly-linked list contains a bunch of nodes that points to the next node. So given the head node, one can traverse the entire list.

A doubly-linked list contains a bunch of nodes that points to the both the previous and next nodes. Given any node, one can traverse each node in the list.

The common functions in a List are: next(), previous() (doubly-linked), add(), and remove().

Search operations require O(N) because each node is in a distinct memory location. However, insertion is O(N), just insert at the head, unless some explicit ordering is required. Deletion is still O(N) because a search is required to find the node for deletion.

LinkedList Illustration.

Stack

A Stack is a data structure that implements a last-in-first-out (LIFO) structure. The last element that is pushed onto the Stack will be the next in line to be popped off the Stack.

The common functions in a Stack are: pop(), push(), and empty().

Stack Illustration.

Heap

A Heap is a specialized tree structure such that the maximal or minimal values is the root node. That is all other nodes below some node contains values that are either less/greater than or equal to the current node.

It allows for retrieving max/min values easily. But insertion or deletion of a node is O(logN) expected time (balanced tree) or O(N) worst case (deletion only) such as finding the smallest node in a maximal heap.

Heap Illustration.

Hash Table

Hash Table is a structure that allows for constant time insertion, deletion, and search but there are no known methods that also support constant time range queries (in addition to the other functionalities).

It works by generating a unique hash value for a particular key that we want to insert into the table. Normally, the hash value is some value modulated by a prime number. Thus, for most keys, we should be hashing into a different buckets. There are ways to deal with collisions such as probing (finding the next available space by hashing the hashed value repeatedly) and chaining (using a LinkedList for collisions).

This data structure may consume a lot more memory than other data structures because more space is preallocated, otherwise collisions may occur. A known form of growing Hash Table, known as Extendible Hash Table, can dynamically allocate/deallocate memory for more and more insertions and deletions.

In C++, the Map structure in the standard library is similar to a Hash Table but it is implemented as a Tree structure. It guarantees O(logN) operations including range queries. Although, this implementation does not satisfy constant time operations in general, it is fast enough to be excused (unless, of course, N is very large).

Queue

A queue is a data structure that implements a first-in-first-out (FIFO) structure. The first element that is queued into the Queue will be the next in line to be dequeued from the Queue.

The common functions in a Queue are: queue(), dequeue(), and empty().

Tree

There are many variants of a tree including binary trees, or n-nary trees (B+ trees for example).

A binary tree is simply a bunch of nodes, where each node has zero, one, or two child nodes. A node with no child is considered a leaf node, otherwise they are known as internal nodes. The root node is a special node, it is the only node that is not a child of any other node.

Comments (21)

Posted by anonymous - generic viagra online at Friday, March 15, 2013 12:02 PM

Thank you ever so for you blog article.Really looking forward to read more. Really Cool.

Posted by anonymous - buy generic cialis at Friday, March 15, 2013 1:41 PM

Appreciate you sharing, great article.Much thanks again. Much obliged.

Posted by anonymous - 10mg levitra at Friday, March 15, 2013 3:21 PM

Major thanks for the blog article.Really looking forward to read more.

Posted by anonymous - Nice at Tuesday, October 22, 2013 3:29 AM

Very nic

Posted by anonymous - N at Friday, May 12, 2017 6:17 PM

Where's the hashtable implementation?

Post a comment

  • Name:
  • Post:
  • Challenge:

Register or login to post comments easier.


Vantasy World Copyright 2011 - 2023. Vantasy World is a project developed by Vantasy Online. Privacy Policy.