Comparing plain strings vs comparing codeUnits

You are using codeUnits method to compare strings in your solution to Challenge 2 of Chapter 4, why not to directly compare strings? performance issues?

In order to check whether the parentheses are balanced in a string like '(hello world', you need to check each character. The only way to do that in Dart that I know of is to look at the code units since Dart strings are strings of code units. Dart provides ready access to those code units with myString.codeUnits.

It would be perfectly valid to convert each code unit back to a string and then compare it to ( and ). However, if your original string is 1000 characters, then you have to do the character-to-string conversion 1000 times. It’s much simpler to just convert ( and ) to their respective code units once and then compare those to the other code units you already have. You’re unlikely to notice a performance difference between those two algorithms in most situations, but the second algorithm requires less work than the first one.

1 Like

I’ve implemented two methods (hasBalancedParentheses1, hasBalancedParentheses3) to compare them with your method (hasBalancedParentheses2)

import 'package:data_structures_algorithms/stack.dart';

const kPars = ['(',')'];

void hasBalancedParentheses1(String value, [bool showResult = true]) {
  var filtered = value.split('').where((e) => kPars.contains(e));
  var stack = Stack<String>();
  
  if (filtered.isEmpty || filtered.length.isOdd) {
    if (showResult) print(false);
    return;
  }
  
  var flag = true;
  for (var e in filtered) {
    if (e == kPars.first) { stack.push(e); } 
    else {
      if (stack.isNotEmpty) { stack.pop(); }
      else { 
        flag = false;
        break; 
      }
    }
  }

  if (showResult) print(stack.isEmpty && flag);
}

void hasBalancedParentheses2(String text, [bool showResult = true]) {
  var stack = Stack<int>();
  
  final open = kPars.first.codeUnitAt(0);
  final close = kPars.last.codeUnitAt(0);
  
  for (int codeUnit in text.codeUnits) {
    if (codeUnit == open) {
      stack.push(codeUnit);
    } else if (codeUnit == close) {
      if (stack.isEmpty) {
        if (showResult) print(false);
        return;
      } else {
        stack.pop();
      }
    }
  }
  if (showResult) print(stack.isEmpty);
}

void hasBalancedParentheses3(String text, [bool showResult = true, int idx = 0, Stack<bool>? stack]) {
  stack ??= Stack<bool>();
  
  if (idx >= text.length) {
    if (showResult) print(stack.isEmpty && idx == text.length);
    return;
  }

  final char = text[idx];
  if (char == kPars.first) stack.push(true);
  if (char == kPars.last) {
    stack.isNotEmpty ? stack.pop() : idx = text.length;
  }
  hasBalancedParentheses3(text, showResult, ++idx, stack);
}

I’ve run this code to compare run times for every method:

int measureTime(void Function() task, {bool simplePrint = false, bool showResult = true}) {
  final start = DateTime.now();
  task();
  final end = DateTime.now();
  final time = end.difference(start).inMicroseconds;
  if (showResult) {
    simplePrint ? print(time) : print('Task done at $time μs');
  }
  return time;
}

void main(List<String> args) {
  const word3 = 'h((e))llo(world)()h((e))llo(world)()h((e))llo(world)()h((e))llo(world)()(())))))))h((e))llo(world)()';
  const n = 10;

  for (var t = 0; t < n; t++) {
    var times = <int>[];
    for (var i = 0; i < n; i++) {
      times.add(measureTime(() => hasBalancedParentheses1(word3*(t+1), false), showResult: false));
    }
    print((times.reduce((a, b) => a + b) / n).toStringAsFixed(2));
    times.clear();
  }
}

Evidently I’ve changed hasBalancedParentheses1 in each run by the next method. I placed the average results in a table:

|      | M1    | M2    | M3    |
| ---- | ----- | ----- | ----- |
| 100  | 855.9 | 319.4 | 173.3 |
| 200  | 227.8 | 51.7  | 47.1  |
| 300  | 161.5 | 36.3  | 31.1  |
| 400  | 196.3 | 37.3  | 35.4  |
| 500  | 163.3 | 35.2  | 42.1  |
| 600  | 108.1 | 36.8  | 32    |
| 700  | 96.9  | 61.2  | 20.6  |
| 800  | 107.4 | 21.4  | 20.8  |
| 900  | 116.7 | 21.9  | 30.3  |
| 1000 | 127   | 31.1  | 20.2  |

It seems that using recursion is slightly more efficient.

This is great! I love seeing a data driven analysis like this. I’ll need to study this more closely in the future, especially the recursive version. Since recursion itself uses a stack internally, that makes me wonder if we could implement a recursive solution without our Stack.

TODO: Note to self: Consider linking to this in the solutions of the next edition.

According your suggestion I’ve created a new recursive method:

void hasBalancedParentheses4(String text, [bool showResult = true, int idx = 0, int stack = 0]) {
  if (idx >= text.length) {
    if (showResult) print(stack == 0 && idx == text.length);
    return;
  }

  final char = text[idx];
  if (char == kPars.first) stack++;
  if (char == kPars.last) {
    stack > 0 ? stack-- : idx = text.length;
  }
  hasBalancedParentheses4(text, showResult, ++idx, stack);
}

After comparing run times, I don’t know what to think, see these two graphics:

runtimes

Where: M? = hasBalancedParentheses?

Why are averages run times different with the same data tested and same CPU?

I also don’t know why the run times are different. Here are a couple guesses:

  • Other running processes on the system are affecting the outcome.
  • The Dart VM gets “smarter” over time as it tries to optimize how it runs. Maybe it is choosing a different internal algorithm to run the code.