How to create a parser that separates numbers (not digits) in Haskell [closed]

I am trying to implement an algorithm that balances chemical reactions, given an input reaction (through two lines of I/O) of the following form:

XNXNXN XNXNXN ... XNXNXN

YNYNYN YNYNYN ... YNYNYN

where the first line is the list of reactants, and the second is the list of products. X represents an atom, and N represents a positive integer.

I am trying create a list whose output looks like this:

 N, N, N,   ... N N
-N, -N, -N, ... -N -N.

However, I don't know how to do this for integers instead of digits. I could try a parser using splitAt and

isElem x [1..], 

but the problem with that is that (with this implementation) I need the list to be evaluated in reverse order, which is impossible if the list is infinite, and computationally heavy if the list is long, due to the fact that lower numbers are more common in chemical equations.

Otherwise, I get spaces between the beginning and end of the number.

I would think a solution would be to have it update a boolean to false whenever it sees a character. Whenever it does so, it goes backwards until it hits another non-digit, concatenates into a string, and then reverses the list because it was going backwards. Then it applies the same function to the remaining list, terminating on an empty list.

The problem is that I only just started Haskell, and I have no idea how to implement this algorithm.


Solution 1:

Is that what you meant? (I hope you can figure out where to put minuses for the other variant)

import Data.Char

go :: [Int] -> Int -> String -> [Int]
go res n [] = reverse $ n:res
go res n (h:t)
  | h `elem` "1234567890"  = go res (n * 10 + ord h - ord '0') t
  | n == 0 = go res 0 t
  | otherwise  = go (n:res) 0 t

parse = go [] 0

The first argument is the total result that is accumulated over the computation. The second one is currently built integer. The third is the input string.

Whenever we see something that is not a number, we finish building the number by adding it into the accumulator and proceeding with a fresh built-integer set to 0. When the list is empty (meaning that we reached the end of the input), we return reversed accumulator. This is because we build it up on a stack which reverses the order. It does not contribute harshly to the performance, as the whole operation is linear anyway.

The main procedure just starts the logic with an empty accumulator and 0 as the currently parsed integer.