Most efficient way to find all the anagrams of each word in a list

  anagram, c++, performance, python

I have been trying to write a program that finds all the anagrams (within the list) of each word in a text file, which contains about 370k words separated by '\n'.

I've already written the code in Python, and it took about an hour to run. I was wondering whether there is a more efficient way of doing it.

My code:

from tqdm.auto import tqdm

with open("words.txt", "r") as f:
    # Strip only the newline, so the last word stays intact even without a trailing '\n'
    ls = [line.rstrip("\n") for line in f]
ls = [[i,''.join(sorted(i))] for i in ls]
ln = set([len(i[1]) for i in tqdm(ls)])

df = {}
for l in tqdm(ln):
    df[l] = [i for i in ls if len(i[0]) == l]

full = {}
for m in  tqdm(ls):
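    if full.get(m[0]) is None:
        # Collect every word with the same sorted letters, including m[0] itself
        group = [i[0] for i in df[len(m[0])] if i[1] == m[1]]
        # Map each word in the group to the other words in it
        for w in group:
            full[w] = [x for x in group if x != w]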

If there are more efficient ways of writing this in other languages (Rust, C, C++, Java …), it would be really helpful if you could post those as well :)
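For reference, one common way to avoid rescanning every same-length word is to group the words in a dictionary keyed by their sorted letters, so the list is only walked once. Below is a minimal sketch of that idea, not a benchmarked solution. The function name `find_anagrams` and the sample words are made up for illustration, and it assumes the whole word list fits in memory.

```python
from collections import defaultdict


def find_anagrams(words):
    """Map each word to the other words in `words` that are its anagrams.

    `find_anagrams` is a hypothetical helper name, not part of the original code.
    """
    # Anagrams share the same sorted-letter key, so one pass groups them.
    groups = defaultdict(list)
    for word in words:
        groups["".join(sorted(word))].append(word)

    # Each word's anagrams are the other members of its group.
    return {
        word: [other for other in group if other != word]
        for group in groups.values()
        for word in group
    }


if __name__ == "__main__":
    # Illustrative sample; in practice the words would be read from words.txt.
    sample = ["listen", "silent", "enlist", "cat", "act", "dog"]
    for word, anagrams in find_anagrams(sample).items():
        print(word, anagrams)
```

With this layout the work is roughly one sort per word plus one dictionary insert, instead of comparing each word against every other word of the same length.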

Source: Windows Questions C++
