前缀树简介

又称单词查找树，字典树，Trie树，是一种树形结构，是一种哈希树的变种。

典型应用是用于统计，排序和保存大量的字符串（但不仅限于字符串），所以经常被搜索引擎系统用于文本词频统计。

它的优点是：利用字符串的公共前缀来减少查询时间，最大限度地减少无谓的字符串比较，查询效率比哈希树高。

是一种典型的用时间换空间的数据结构

前缀树的性质：

根节点不包含字符，除根节点外每一个节点都只包含一个字符
从根节点到某一节点，路径上经过的字符连接起来，为该节点对应的字符串
每个节点的所有子节点包含的字符都不相同

假设有b，abc，abd，bcd，abcd，efg，hii 这6个单词,那我们创建trie树就得到

20190924193658133

其中红色的节点表示，从根节点到该节点连成的字符串，是前缀树包含的。

Java实现

定义前缀树节点的数据结构

class Node{
    Map<Character,Node> child;
    boolean isEnd;
    public Node(){
        child = new HashMap<>();
        isEnd = false;
    }
}

用Map来存储该节点的孩子节点

布尔类型的isEnd很重要，前缀树包含某一字符串，但该前缀树不一定存入该字符串。【如上图所示的红色节点】

它标识以该字符为结尾的字符串是存入的字符串。

例如：

1 2	tire.insert("apple"); tire.search("app");//false

前缀树的实现

字符串插入

public void insert(String word) {
    Node node = root;//指针指向根节点,从前缀树的根节点开始遍历
    for (int i = 0; i < word.length(); i++) {//遍历该字符串
        char c = word.charAt(i);
        if(!node.child.containsKey(c)){//如果该节点的孩子节点不包含该字符
            node.child.put(c,new Node());//新加入该字符
        }
        node = node.child.get(c);//指针指向孩子节点。
    }
    node.child.get(c).isEnd=true;//当到了最后的一个字符时，就标记为一个字符串的结尾。
}

字符串查找

public boolean search(String word) {
    Node node = root;//指针指向根节点
    for (int i = 0; i < word.length(); i++) {//遍历该字符串
        char c = word.charAt(i);
        //如果存在某个节点的孩子不包含该字符串，就说明前缀树中不存在该字符串
        if(!node.child.containsKey(c))return false;
        node = node.child.get(c);//指针指向孩子节点。
    }
    return node.isEnd;//都包含的还不行，还得看看该节点是否标记为字符串的结束标志
}

前缀树是否包含字符串的前缀

和字符串代码唯一不同的是：

如果包含该字符串，最后return的时候，不会判断该节点是否标记为字符串结束标志。

public boolean startsWith(String prefix) {
    Node node = root;
    for (int i = 0; i < prefix.length(); i++) {
        char c = prefix.charAt(i);
        if(!node.child.containsKey(c))return false;
        node = node.child.get(c);
    }
    return true;
}

找到以某字符串为前缀的字符串

例如：

前缀树包含的字符串有，apple、code、appl、leet、apples、abc、ace

找到以app为前缀的字符们：apple、appl、apples

public List<String> searchprefix(String word) {
    cnt = 0;   //前缀字符串数量
    List<String> res = new ArrayList<>();
    Node node = root;		//指针指向根节点
    for (int i = 0; i < word.length();i++) {
        char c = word.charAt(i);
        if (!node.child.containsKey(c)) return res;//如果不包含该前缀字符串直接return res
        node = node.child.get(c);//
    }
    //找到前缀之后，以前缀末尾字符的节点开始进行深度优先遍历。
    dfs(word, node, res, "");
    return res;
}
public void dfs(String word, Node node, List<String> res, String path) {  // 深度优先搜索
    if (cnt >= COUNT) return;//COUNT为要查找的前缀字符串数量
    if (node.isEnd && !word.equals(word + path)) {
        res.add(word + path);
        cnt++;
    }
    for (Map.Entry<Character, Node> entry : node.child.entrySet()) {
        node = entry.getValue();
        path = path + entry.getKey();
        dfs(word, node, res, path);
        path = path.substring(0, path.length() - 1);
    }
}

完整Java代码

class Trie {

    final int COUNT = 10;  // 提示内容的最大数目
    int cnt = 0;
    private static class Node{
        Map<Character,Node> child;
        boolean isEnd;
        public Node(){
            child = new HashMap<>();
            isEnd = false;
        }
    }

    private final Node root;
    public Trie() {
        root = new Node();
    }

    public void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if(!node.child.containsKey(c)){
                node.child.put(c,new Node());
            }
            node = node.child.get(c);
        }
        node.child.get(c).isEnd=true;
    }

    public boolean search(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if(!node.child.containsKey(c))return false;
            node = node.child.get(c);
        }
        return node.isEnd;
    }

    public boolean startsWith(String prefix) {
        Node node = root;
        for (int i = 0; i < prefix.length(); i++) {
            char c = prefix.charAt(i);
            if(!node.child.containsKey(c))return false;
            node = node.child.get(c);
        }
        return true;
    }
    public List<String> searchprefix(String word) {
        cnt = 0;
        List<String> res = new ArrayList<>();
        Node node = root;    
        for (int i = 0; i < word.length();i++) {
            char c = word.charAt(i);
            if (!node.child.containsKey(c)) return res;
            node = node.child.get(c);
        }
        dfs(word, node, res, "");
        return res;
    }
    public void dfs(String word, Node node, List<String> res, String path) {  
        if (cnt >= COUNT) return;
        if (node.isEnd && !word.equals(word + path)) {
            res.add(word + path);
            cnt++;
        }
        for (Map.Entry<Character, Node> entry : node.child.entrySet()) {
            node = entry.getValue();
            path = path + entry.getKey();
            dfs(word, node, res, path);
            path = path.substring(0, path.length() - 1);
        }
    }
}

__END__