Skip to content

DasunThathsara/User-defined-Regex

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 

Repository files navigation

User-Defined Regex

Implement Regex characters using the Knuth-Morris-Pratt (KMP) string matching algorithm. The solution was done using the Python programming language.



How to run?

First, go to the python folder and find the main.py file. This is the code file in this scenario. You can run it using your own test cases and open the results folder to view the outputs. If you want to add a new test case, go to the test cases folder and create your own test case file.


You can change this text file name to your own file name.

text = open("test cases/text.txt", 'r')
pattern = open("test cases/pattern.txt", 'r')
output = open("results/patternmatch.output", 'w')

regularExpression.py contains the Regex file, and the main file imports functions from the regularExpression.py file.

First, import regularExpression.py and set the alias to re. Because regularExpression is too long and hard to handle.

import regularExpression as re



Methodology

I use the Knuth-Morris-Pratt (KMP) algorithm to implement this and filter the pattern using simple if statements. Also, the matiching condition is called to the relevant function, and the output is given as a Boolean value.

def search(String, Pattern):
    # --------------------- Find Either Or relation --------------------
    if any(i == '|' for i in Pattern):
        return search(String, Pattern.split('|')[0]) or search(String, Pattern.split('|')[1])


    # ---- Find if String starts with Pattern and ends with Pattern ----
    elif any(i == '?' for i in Pattern):
        if Pattern[-1] == '?':
            return search(String, Pattern.split('?')[0]) or search(String, Pattern.split('?')[0][:len(Pattern.split('?')[0]) - 1])
        elif Pattern[0] == '?':
            return search(String, Pattern.split('?')[1]) or search(String, Pattern.split('?')[1][1:])
        else:
            return search(String, Pattern.split('?')[0][-1] + Pattern.split('?')[1][0]) or search(String, Pattern.split('?')[0][-1][:len(Pattern.split('?')[0][-1]) - 1] + Pattern.split('?')[1][0])


    # ---- Find if String starts with Pattern and ends with Pattern ----
    elif Pattern[0] == "^" and Pattern[-1] == "$":
        return kmp(String, "".join(Pattern[1: len(Pattern) - 1])) and len(String) == len(Pattern) - 2


    # --------------- Find if String starts with Pattern ---------------
    elif Pattern[0] == "^":
        return kmp(String[: len(Pattern) - 1], "".join(Pattern[1:]))


    # ---------------- Find if String ends with Pattern ----------------
    elif Pattern[-1] == "$":
        return kmp(String[len(String) - len(Pattern) + 1:], "".join(Pattern[0: len(Pattern) - 1])) and search(String, "".join(Pattern[:len(Pattern) - 1]))


    return kmp(String, Pattern)

In this version, the program recognizes ^ | $ ? as the regex character.



*If you want to clone and run this code, open the Python folder in your IDE and run. Please use Python 3.9.


License: MIT Hits Tests Passing

Releases

No releases published

Packages

No packages published

Languages