Introduction
As a former student of a Data Science Master Degree, most of my courses were a mix of theory and practice, kind of on the style of practice what you preach (no, not in Barry White’s style, but you got the idea). The theory was followed by practical examples and a final project that took more than a couple of hours to be finished, collaboration with other student colleagues (communication, communication and again, communication, such an important skill! - just to mention one of the book in my book backlog list: The Art of Communicating) and lots of Stackoverflow posts.
Machine Learning, Deep Learning, Data Management, Data Visualizations, Programming, … but no Data Structure and Algorithm courses. Coming from a Business bachelor degree, with a focus on statistics (and R as a programming language), this has meant a lot of self-learning. Websites such as FreeCodeCamp, Stackoverflow (yes, again), W3Schools, Geeks for Geeks, The Real Python, the Python documentation, … were essential for getting started, to grasp the main concepts and to apply this new knowledge to given problems.
A few years later, I found myself happily working as a Data Engineer in a big company. I’ve already had quite a few years of experience programming in Python, both for work and for my personal side projects (such as an Expense Tracker), but I felt that something was missing.
That something was a deeper understanding of the fundamental structure of the programming languages. In this case, I’ll be considering Python.
A few months ago I decided to dive deeper into Data Structures and Algorithms, and I started solving problems on LeetCode.
My learning workflow usually looks like this:
- Check which concept to learn on Roadmap.sh
- Watch a few explanations on YouTube
- Try related problems on LeetCode
- Write pseudocode first
- Implement the solution in VSCode
- Iterate until it passes the tests
I would say that, so far, I’ve crunched some topics: I feel more confident and I’ve learned along the way about linked lists, hashmaps, arrays, stacks, …
Which brings me to where I am now. With this blog post I want to talk and go through one of the last challenges that I’ve solved with stacks: I’m talking about the 224. Basic Calculator challenge, a hard one.
One way to solve this problem is to evaluate the expression directly using stacks.
In this post, however, I explore a different route: converting the infix expression into Reverse Polish Notation (postfix) and then evaluating it.
What is the idea behind?
The whole concept relies on the RPN (Reverse Polish Notation) mathematical system, invented by Jan Łukasiewicz (a Polish logician and philosopher) in 1920:
How does it work?
In the RPN framework, the operators follow their operands. For example, to add 3 and 5, instead of writing:
3+ 5
One would write:
3 5 +
As Wikipedia states:
The advantage of reverse Polish notation is that it removes the need for order of operations and parentheses that are required by infix notation and can be evaluated linearly, left-to-right. For example, the infix expression:
(3 + 4) × (5 + 6)
becomes :
3 4 + 5 6 + × in reverse Polish notation. https://en.wikipedia.org/wiki/Reverse_Polish_notation
To evaluate the RPN, you read left to right and use a stack by using the following two concepts:
- Push numbers onto the stack.
- When you see an operator, pop the top two numbers, apply the operation, and push the result back.
Step 1
Read 3 → push
Stack: [3]
Step 2
Read 4 → push
Stack: [3, 4]
Step 3
Read + → compute 3 + 4 = 7
Stack: [7]
Step 4
Read 5 → push
Stack: [7, 5]
Step 5
Read 6 → push
Stack: [7, 5, 6]
Step 6
Read + → compute 5 + 6 = 11
Stack: [7, 11]
Step 7
Read × → compute 7 × 11 = 77
Stack: [77]
Final result:
77!
Well, the whole idea is based on stacks, particularly on the LIFO (Last In First Out) principle. The first conversion is from the infix to the postfix notation: after that, the scanning happens from left-to-right and operands are stacked one after the other followed by the operators, as shown above. When an operator is encountered, the algorithm pops the two most recent operands from the stack, applies the operation, and pushes the result back onto the stack.
The Code
The code comprises the following methods:
from_infix_to_postfix_notation: for converting the expression from the infix to postfix notationmerge_consecutive_value: to merge in the expression consecutive values that should be considered as onefix_unary2binary_operators: in case a unary operator appears, a simple trick is to add, for addition and subtraction unary operator, a 0 in front of themevalRPN: the actual method that will evaluate the expressioncalculate: the wrapping method to call all of the others
For the code, feel free to check it out in my GitHub repo!