Convert 8-bit binary number to BCD in VHDL

The algorithm is well known: you do 8 left shifts and check the units, tens, and hundreds digits (4 bits each) after each shift. If a digit is above 4, you add 3 to that group, and so on.

Here is a process-based solution that does not work. It compiles, but the output is not what I wanted. Any thoughts on what the problem could be?

library ieee ;
use ieee.std_logic_1164.all ;
use ieee.std_logic_unsigned.all ;

entity hex2bcd is
    port ( hex_in  : in  std_logic_vector (7 downto 0) ;
           bcd_hun : out std_logic_vector (3 downto 0) ;
           bcd_ten : out std_logic_vector (3 downto 0) ;
           bcd_uni : out std_logic_vector (3 downto 0) ) ;
end hex2bcd ;

architecture arc_hex2bcd of hex2bcd is
begin
    process ( hex_in )
        variable hex_src : std_logic_vector (7 downto 0) ;
        variable bcd     : std_logic_vector (11 downto 0) ;
    begin
        hex_src := hex_in ;
        bcd     := (others => '0') ;

        for i in 0 to 7 loop
            bcd := bcd(11 downto 1) & hex_src(7) ; -- shift bcd + 1 new entry
            hex_src := hex_src(7 downto 1) & '0' ; -- shift src + pad with 0

            if bcd(3 downto 0) > "0100" then
                bcd(3 downto 0) := bcd(3 downto 0) + "0011" ;
            end if ;
            if bcd(7 downto 4) > "0100" then
                bcd(7 downto 4) := bcd(7 downto 4) + "0011" ;
            end if ;
            if bcd(11 downto 8) > "0100" then
                bcd(11 downto 8) := bcd(11 downto 8) + "0011" ;
            end if ;
        end loop ;

        bcd_hun <= bcd(11 downto 8) ;
        bcd_ten <= bcd(7  downto 4) ;
        bcd_uni <= bcd(3  downto 0) ;

    end process ;
end arc_hex2bcd ;

Solution 1:

The comments were getting too long.

Consider the following block diagram:

bin8bcd block diagram

This represents an unrolled loop (for i in 0 to 7 loop). It shows that no add +3 occurs before i = 2 for the LS BCD digit, that no add +3 occurs before i = 5 for the middle BCD digit, and that no adjustment occurs on the MS BCD digit, which is comprised in part of static '0' values.

This gives us a total of 7 add3 modules (represented by the enclosing if statement, and conditional add +3).

This is demonstrated in VHDL:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bin8bcd is
    port (
        bin:    in  std_logic_vector (7 downto 0);
        bcd:    out std_logic_vector (11 downto 0)
    );
end entity;

architecture struct of bin8bcd is
    procedure add3 (signal bin: in  std_logic_vector (3 downto 0); 
                    signal bcd: out std_logic_vector (3 downto 0)) is
    variable is_gt_4:  std_logic;
    begin
        is_gt_4 := bin(3) or (bin(2) and (bin(1) or bin(0)));

        if is_gt_4 = '1' then
        -- if to_integer(unsigned (bin)) > 4 then
            bcd <= std_logic_vector(unsigned(bin) + "0011");
        else
            bcd <= bin;
        end if;
    end procedure;

    signal U0bin,U1bin,U2bin,U3bin,U4bin,U5bin,U6bin:
                std_logic_vector (3 downto 0);

    signal U0bcd,U1bcd,U2bcd,U3bcd,U4bcd,U5bcd,U6bcd:
                std_logic_vector (3 downto 0);       
begin
    U0bin <= '0' & bin (7 downto 5);
    U1bin <= U0bcd(2 downto 0) & bin(4);
    U2bin <= U1bcd(2 downto 0) & bin(3);
    U3bin <= U2bcd(2 downto 0) & bin(2);
    U4bin <= U3bcd(2 downto 0) & bin(1);

    U5bin <= '0' & U0bcd(3) & U1bcd(3) & U2bcd(3);
    U6bin <= U5bcd(2 downto 0) & U3bcd(3);

U0: add3(U0bin,U0bcd);

U1: add3(U1bin,U1bcd);

U2: add3(U2bin,U2bcd);

U3: add3(U3bin,U3bcd);

U4: add3(U4bin,U4bcd);

U5: add3(U5bin,U5bcd);

U6: add3(U6bin,U6bcd);

OUTP:
    bcd <= '0' & '0' & U5bcd(3) & U6bcd & U4bcd & bin(0);

end architecture;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bin8bcd_tb is
end entity;

architecture foo of bin8bcd_tb is
    signal bin: std_logic_vector (7 downto 0) := (others => '0');
    -- (initialized to prevent those annoying metavalue reports)

    signal bcd: std_logic_vector (11 downto 0);

begin

DUT:
    entity work.bin8bcd
        port map (
            bin => bin,
            bcd => bcd
        );

STIMULUS:
    process

    begin
        for i in 0 to 255 loop
            bin <= std_logic_vector(to_unsigned(i,8));
            wait for 1 ns;
        end loop;
        wait for 1 ns;
        wait;
    end process;
end architecture;

Running the accompanying test bench yields:

bin8bcd_tb png

And if you were to scroll through the entire waveform you'd find that all bcd outputs from 001 to 255 are present and accounted for (no holes), no 'X's or 'U's anywhere.

From the representation in the block diagram showing i = 7 we see that no add +3 occurs after the final shift.

Also note that the LSB of bcd is always the LSB of bin, and that bcd(11) and bcd(10) are always '0'.

The add3 can be hand optimized to create an increment by 3 using logic operators to get rid of any possibility of reporting meta values derived from bin (and there'd be a lot of them).

As far as I can tell this represents the most optimized representation of 8 bit binary to 12 bit BCD conversion.

Some time ago I wrote a C program to provide input to espresso (a term minimizer):

/*
 * binbcd.c   - generates input to espresso for 8 bit binary
 *         to 12 bit bcd.
 *
 */
#include <stdlib.h>
#include <stdio.h>


int main (int argc, char **argv)
{
int binary;
int bit;
char bcd_buff[4];
int digit;
int bcd;

    printf(".i 8\n");
    printf(".o 12\n");

    for (binary = 0; binary < 256; binary++)  {
        for ( bit = 7; bit >= 0; bit--) {
            if ((1 << bit) & binary)
                printf("1");
            else
                printf("0");
        }

        digit = snprintf(bcd_buff,4,"%03d",binary); /* leading zeros */

        if (digit != 3) {
            fprintf(stderr,"%s: binary to string conversion failure, digit = %d\n",
                argv[0],digit);
            exit (-1);
        }

        printf (" ");  /* input to output space */

        for ( digit = 0; digit <= 2; digit++) {
            bcd = bcd_buff[digit] - 0x30;
            for (bit = 3; bit >= 0; bit--) {
                if ((1 << bit) & bcd) 
                    printf("1");
                else
                    printf("0"); 
            }
        }
        /* printf(" %03d",binary); */
        printf("\n");
    }

    printf (".e\n");
    exit (0);
}

I then started poking around with intermediary terms, which leads directly to what is represented in the block diagram above.

And of course you could use an actual component add3 as well as use nested generate statements to hook everything up.

You won't get the same minimized hardware from a loop statement representation without constraining the if statements (1 < i < 7 for the LS BCD digit, 4 < i < 7 for the middle BCD digit, so that the first add +3 falls at i = 2 and i = 5 respectively, as in the block diagram).

You'd want the subsidiary nested generate statement to provide the same constraints for a shortened structural representation.

A logic operator version of add3 is shown on PDF page 5 of the university lecture slides for Binary to BCD Conversion using double dabble, where the forward tick is used for negation, "+" signifies OR, and adjacency signifies AND.

The add3 then looks like:

procedure add3 (signal bin: in  std_logic_vector (3 downto 0); 
                signal bcd: out std_logic_vector (3 downto 0)) is

begin

    bcd(3) <=  bin(3) or 
              (bin(2) and bin(0)) or 
              (bin(2) and bin(1));

    bcd(2) <= (bin(3) and bin(0)) or
              (bin(2) and not bin(1) and not bin(0));

    bcd(1) <= (bin(3) and not bin(0)) or
              (not bin(2) and bin(1)) or
              (bin(1) and bin(0));

    bcd(0) <= (bin(3) and not bin(0)) or
              (not bin(3) and not bin(2) and bin(0)) or
              (bin(2) and bin(1) and not bin(0));

end procedure;

Note this would allow package numeric_std (or equivalent) to be dropped from the context clause.

If you write signals in AND terms in the same order (in this case left to right), the duplicated AND terms show up well, as they also do using espresso. There is no value in using intermediary AND terms in an FPGA implementation; these all fit in LUTs just the way they are.

espresso input for add3:

.i 4
.o 4
0000 0000
0001 0001
0010 0010
0011 0011
0100 0100
0101 1000
0110 1001
0111 1010
1000 1011
1001 1100
1010 ----
1011 ----
1100 ----
1101 ----
1110 ----
1111 ----
.e

And espresso's output (espresso -eonset):

.i 4
.o 4
.p 8
-100 0100
00-1 0001
--11 0010
-01- 0010
-110 1001
-1-1 1000
1--1 1100
1--0 1011
.e

When you consider the combinatorial 'depth' of the binary to BCD conversion, for an FPGA it's 6 LUTs (the 6th an input to something following). That likely limits the clock speed to something shy of 100 MHz if the conversion occurs in one clock.

By pipelining or using sequential logic (clocked loop) you'd be able to run an FPGA at its fastest speed while executing in 6 clocks.