# Suffix Trees in Java

##### Yasin Cakal

Suffix trees are a powerful data structure for storing and searching strings. A suffix tree is a compact representation of all suffixes of a given string: a compressed trie in which each path from the root to a leaf spells out one suffix. It has many applications in string algorithms.

Suffix trees are often used for search algorithms, text processing, string manipulation, and indexing. They are also known as PAT trees or, in an earlier form, position trees. Two related structures are worth distinguishing: a generalized suffix tree (GST) is a suffix tree built over a set of strings rather than a single string, and a PATRICIA tree is a compressed trie for arbitrary keys, not specifically for suffixes.

In this article, we are going to discuss the various operations of a suffix tree and the time and space complexities associated with them. We will also look at some examples of how to implement a suffix tree using Java code.

## What is a Suffix Tree?

A suffix tree is a tree of nodes and edges built from all suffixes of a given string. Each path from the root to a leaf spells out one suffix of the original string, so every substring of the string appears as the beginning of some root-to-leaf path.

Once built, a suffix tree can answer whether a pattern of length m occurs in the string in O(m) time, regardless of how long the string is. It is a form of trie, and it is often used for text processing and string manipulation.

Suffix trees are used in a variety of applications, including pattern matching, text indexing, and data compression.

## Time and Space Complexity of Suffix Trees

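A suffix tree for a string of length n can be built in O(n) time, for example with Ukkonen's algorithm (assuming a constant-size alphabet). Once the tree is built, checking whether a pattern of length m occurs in the string takes O(m) time, independent of n. The simplified implementation in this article adds each suffix one character at a time, which takes O(n²) time in the worst case.

The space complexity of a true (compressed) suffix tree is O(n): it has at most n leaves and fewer than n internal nodes, and each edge label can be stored as a pair of indices into the original string. An uncompressed trie with one character per node, like the one implemented later in this article, can need O(n²) nodes.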
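
To make the space cost of the uncompressed form concrete, the sketch below counts how many nodes a one-character-per-node suffix trie would need. It relies on the fact that each non-root node of such a trie corresponds to exactly one distinct non-empty substring of the text. The class and method names (`TrieSizeEstimate`, `countTrieNodes`) are made up for this illustration.

```java
import java.util.HashSet;
import java.util.Set;

class TrieSizeEstimate {
    // Counts the non-root nodes of an uncompressed suffix trie of text.
    // Each such node corresponds to one distinct non-empty substring.
    static int countTrieNodes(String text) {
        Set<String> distinct = new HashSet<>();
        for (int start = 0; start < text.length(); start++) {
            for (int end = start + 1; end <= text.length(); end++) {
                distinct.add(text.substring(start, end));
            }
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        System.out.println(countTrieNodes("algorithm")); // 45 = 9 * 10 / 2
        System.out.println(countTrieNodes("banana"));    // 15
        System.out.println(countTrieNodes("aaaa"));      // 4
    }
}
```

A string with all-distinct characters, such as “algorithm”, hits the quadratic worst case, while a highly repetitive string such as “aaaa” needs only one node per character.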

## Example of a Suffix Tree

Let’s look at an example of a suffix tree. Consider the string “algorithm”.

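The suffix tree for “algorithm” is built from the following suffixes:

• algorithm
• lgorithm
• gorithm
• orithm
• rithm
• ithm
• thm
• hm
• m

As you can see, the suffix tree is composed of all the suffixes of the string “algorithm”. Because every character in “algorithm” is distinct, no two suffixes share a first character, so in the compressed suffix tree each suffix is a single edge from the root to its own leaf.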
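
The suffixes of a string can be generated with `String.substring`, one per starting index. A minimal sketch, with the hypothetical names `SuffixLister` and `suffixes`:

```java
import java.util.ArrayList;
import java.util.List;

class SuffixLister {
    // Returns every non-empty suffix of text, longest first.
    static List<String> suffixes(String text) {
        List<String> result = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            result.add(text.substring(start));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the nine suffixes of "algorithm", one per line.
        for (String suffix : suffixes("algorithm")) {
            System.out.println(suffix);
        }
    }
}
```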

## Java Implementation of a Suffix Tree

Now, let’s look at how to implement a suffix tree using Java.

The first step is to create the Node class. The Node class will represent the nodes of the suffix tree. The Node class should have the following properties:

• String data: This is the data stored in the node.
• List<Node> children: This is the list of child nodes.
• boolean isLeaf: This is a boolean that indicates whether a stored string ends at this node.

import java.util.ArrayList;
import java.util.List;

public class Node {
    String data;
    List<Node> children;
    boolean isLeaf;

    public Node(String data) {
        this.data = data;
        this.children = new ArrayList<>();
        this.isLeaf = false;
    }
}

The next step is to create the SuffixTree class. The SuffixTree class will represent the suffix tree. The SuffixTree class should have the following properties:

• Node root: This is the root node of the tree.

public class SuffixTree {
    Node root;

    public SuffixTree() {
        this.root = new Node("");
    }
}


Now, let’s look at how to add a string to the suffix tree.

The addString() method takes a string as a parameter and adds it to the tree one character at a time, following existing child nodes where they match and creating new nodes otherwise. To build the suffix tree of a text, call addString() once for each suffix of that text.

public void addString(String str) {
    Node currentNode = root;

    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);

        // Check if the character is already present among the children
        boolean isPresent = false;
        for (Node child : currentNode.children) {
            if (child.data.equals(Character.toString(c))) {
                currentNode = child;
                isPresent = true;
                break;
            }
        }

        // If the character is not present, create a new node,
        // attach it to the current node, and continue from it
        if (!isPresent) {
            Node newNode = new Node(Character.toString(c));
            currentNode.children.add(newNode);
            currentNode = newNode;
        }
    }

    // Mark the end of the string
    currentNode.isLeaf = true;
}
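
Adding only the full text produces a single path, so a suffix structure needs every suffix added. The following self-contained sketch puts the pieces together under that assumption: it reuses the article's node layout, adds all suffixes in the constructor, and answers substring queries by walking from the root. The names `SuffixTrieDemo`, `findChild`, and `containsSubstring` are hypothetical, and the result is an uncompressed suffix trie rather than a linear-size suffix tree.

```java
import java.util.ArrayList;
import java.util.List;

class SuffixTrieDemo {
    static class Node {
        final String data;
        final List<Node> children = new ArrayList<>();
        boolean isLeaf = false;

        Node(String data) {
            this.data = data;
        }
    }

    final Node root = new Node("");

    // Builds the trie by adding every suffix of text.
    SuffixTrieDemo(String text) {
        for (int i = 0; i < text.length(); i++) {
            addString(text.substring(i));
        }
    }

    // Returns the child of node labelled with c, or null if there is none.
    private static Node findChild(Node node, char c) {
        String label = Character.toString(c);
        for (Node child : node.children) {
            if (child.data.equals(label)) {
                return child;
            }
        }
        return null;
    }

    void addString(String str) {
        Node current = root;
        for (int i = 0; i < str.length(); i++) {
            Node next = findChild(current, str.charAt(i));
            if (next == null) {
                next = new Node(Character.toString(str.charAt(i)));
                current.children.add(next); // attach the new node to its parent
            }
            current = next;
        }
        current.isLeaf = true; // a suffix ends here
    }

    // True if pattern occurs anywhere in the text: every substring
    // is the beginning of some suffix, so it is a path from the root.
    boolean containsSubstring(String pattern) {
        Node current = root;
        for (int i = 0; i < pattern.length(); i++) {
            current = findChild(current, pattern.charAt(i));
            if (current == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SuffixTrieDemo trie = new SuffixTrieDemo("algorithm");
        System.out.println(trie.containsSubstring("gori")); // true
        System.out.println(trie.containsSubstring("rig"));  // false
    }
}
```

Each query inspects one level of the trie per pattern character, so its cost depends on the pattern length and the alphabet size, not on the length of the indexed text.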

## Conclusion

In this article, we discussed suffix trees and their various operations. We looked at the time and space complexities of the operations and also looked at an example of how to implement a suffix tree using Java.

Suffix trees are a useful data structure for text processing, search algorithms, and string manipulation. A true suffix tree can be built in linear time and stored in linear space, which makes it efficient for large texts; the simplified one-character-per-node version shown in this article is easier to implement but can take quadratic time and space.

## Exercises

#### Write a Java program to construct a suffix tree for the given string.

public class SuffixTree {
    Node root;

    public SuffixTree(String str) {
        this.root = new Node("");
        // Add every suffix of str, from the whole string down to its last character
        for (int i = 0; i < str.length(); i++) {
            addString(str.substring(i));
        }
    }

    private void addString(String str) {
        Node currentNode = root;

        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);

            // Check if the character is already present among the children
            boolean isPresent = false;
            for (Node child : currentNode.children) {
                if (child.data.equals(Character.toString(c))) {
                    currentNode = child;
                    isPresent = true;
                    break;
                }
            }

            // If the character is not present, create and attach a new node
            if (!isPresent) {
                Node newNode = new Node(Character.toString(c));
                currentNode.children.add(newNode);
                currentNode = newNode;
            }
        }

        // Mark the end of the suffix
        currentNode.isLeaf = true;
    }
}


#### Write a Java program to search for a given string in the suffix tree.

public boolean search(String str) {
    Node currentNode = root;

    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);

        // Follow the child labelled with the current character, if any
        boolean isPresent = false;
        for (Node child : currentNode.children) {
            if (child.data.equals(Character.toString(c))) {
                currentNode = child;
                isPresent = true;
                break;
            }
        }

        // No matching child: the string is not in the tree
        if (!isPresent) {
            return false;
        }
    }

    // The path exists; report a match only if a stored string ends here
    return currentNode.isLeaf;
}
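
The search above reports whether a string is one of the stored strings. A closely related query is how many times a pattern occurs in the text: in a trie that contains every suffix, this equals the number of end-of-suffix markers at or below the node reached by following the pattern. The sketch below illustrates the idea with a hypothetical map-based node (`OccurrenceCounter`, `TrieNode`, `endsSuffix`) instead of the article's Node class, which keeps child lookup short.

```java
import java.util.HashMap;
import java.util.Map;

class OccurrenceCounter {
    static class TrieNode {
        final Map<Character, TrieNode> children = new HashMap<>();
        boolean endsSuffix = false;
    }

    private final TrieNode root = new TrieNode();

    // Inserts every suffix of text, one character per node.
    OccurrenceCounter(String text) {
        for (int start = 0; start < text.length(); start++) {
            TrieNode node = root;
            for (int i = start; i < text.length(); i++) {
                node = node.children.computeIfAbsent(text.charAt(i), k -> new TrieNode());
            }
            node.endsSuffix = true;
        }
    }

    // Counts how many suffixes begin with the (non-empty) pattern,
    // which is the number of positions where it occurs in the text.
    int countOccurrences(String pattern) {
        TrieNode node = root;
        for (int i = 0; i < pattern.length(); i++) {
            node = node.children.get(pattern.charAt(i));
            if (node == null) {
                return 0;
            }
        }
        return countSuffixEnds(node);
    }

    // Counts end-of-suffix markers in the subtree rooted at node.
    private static int countSuffixEnds(TrieNode node) {
        int total = node.endsSuffix ? 1 : 0;
        for (TrieNode child : node.children.values()) {
            total += countSuffixEnds(child);
        }
        return total;
    }

    public static void main(String[] args) {
        OccurrenceCounter counter = new OccurrenceCounter("banana");
        System.out.println(counter.countOccurrences("ana")); // 2
        System.out.println(counter.countOccurrences("a"));   // 3
        System.out.println(counter.countOccurrences("nab")); // 0
    }
}
```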

#### Write a Java program to insert a new string into the suffix tree.

public void insert(String str) {
    Node currentNode = root;

    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);

        // Check if the character is already present among the children
        boolean isPresent = false;
        for (Node child : currentNode.children) {
            if (child.data.equals(Character.toString(c))) {
                currentNode = child;
                isPresent = true;
                break;
            }
        }

        // If the character is not present, create a new node,
        // attach it to the current node, and continue from it
        if (!isPresent) {
            Node newNode = new Node(Character.toString(c));
            currentNode.children.add(newNode);
            currentNode = newNode;
        }
    }

    // Mark the end of the string
    currentNode.isLeaf = true;
}


#### Write a Java program to delete a given string from the suffix tree.

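public void delete(String str) {
    Node currentNode = root;

    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);

        // Follow the child labelled with the current character, if any
        boolean isPresent = false;
        for (Node child : currentNode.children) {
            if (child.data.equals(Character.toString(c))) {
                currentNode = child;
                isPresent = true;
                break;
            }
        }

        // The string is not in the tree, so there is nothing to delete
        if (!isPresent) {
            return;
        }
    }

    // Unmark the end of the string. The nodes themselves are left in place,
    // because they may be shared with other stored strings.
    currentNode.isLeaf = false;
}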

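#### Write a Java program to find the longest common substring of two strings using a suffix tree.

// Returns the child of node labelled with c, or null if there is none
private Node findChild(Node node, char c) {
    for (Node child : node.children) {
        if (child.data.equals(Character.toString(c))) {
            return child;
        }
    }
    return null;
}

public String longestCommonSubstring(String str1, String str2) {
    // Build a trie of all suffixes of str1, so that every substring
    // of str1 is a path starting at trieRoot
    Node trieRoot = new Node("");
    for (int start = 0; start < str1.length(); start++) {
        Node currentNode = trieRoot;
        for (int i = start; i < str1.length(); i++) {
            Node child = findChild(currentNode, str1.charAt(i));
            if (child == null) {
                child = new Node(Character.toString(str1.charAt(i)));
                currentNode.children.add(child);
            }
            currentNode = child;
        }
    }

    // From each position in str2, follow the trie as far as possible
    // and remember the longest match found
    String longestSubstring = "";
    for (int start = 0; start < str2.length(); start++) {
        Node currentNode = trieRoot;
        int end = start;
        while (end < str2.length()) {
            Node child = findChild(currentNode, str2.charAt(end));
            if (child == null) {
                break;
            }
            currentNode = child;
            end++;
        }
        if (end - start > longestSubstring.length()) {
            longestSubstring = str2.substring(start, end);
        }
    }

    return longestSubstring;
}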