
Bitcoin miner from eBay scrap D’oh!!!! (VII)

09 Jul

It's always good to take a look at past work…

In particular the VHDL… hey, it looks OK so it MUST be OK…

[Image: FPGA]

When I say look at it, I mean from a totally different perspective…

The blue & red are the inner & outer SHA256 algorithms. The little square jobby in the red area is a constrained controller (constraining tells the tools that you would like to keep all that crap together).

Constraining is a way to control the routing inside the FPGA; by controlling the routing you control the delays between parts of the logic.
If you ‘over’ constrain, the router cannot do its job and connect all the logic together. So basically you build a design, go take a look at what is broken (whatever falls outside the timing the design needs), then if possible either re-design or constrain the broken logic…

Now for the yellow flecks mixed in with the red… This is supposed to be a RAM buffer… but wait… Xilinx FPGAs have blocks of RAM prebuilt for use.
The yellow is NOT IT… that’s to say the yellow flecks are not located where a RAM block should be.
What the stupid compiler has done is build RAM circuits from discrete logic… and ignore the actual RAM.

Fixing the ballsup

This is how the code is described in VHDL….
Hmmm… looks ok.

dist_inst : RAM16X1D
  generic map (INIT => X"0000")   -- initial RAM contents
  PORT MAP (
    DPO   => dout(i),             -- read-port data out
    SPO   => OPEN,                -- write-port data out, unused
    A0    => wr_ptr(0),           -- write address
    A1    => wr_ptr(1),
    A2    => wr_ptr(2),
    A3    => wr_ptr(3),
    D     => din(i),              -- write data
    DPRA0 => rd_ptr(0),           -- read address
    DPRA1 => rd_ptr(1),
    DPRA2 => rd_ptr(2),
    DPRA3 => rd_ptr(3),
    WCLK  => clk,
    WE    => wr_en_int);

But it is not. This is the correct code…

dist_inst : RAM16X1D_1            -- note the "_1"
  generic map (INIT => X"0000")   -- initial RAM contents
  PORT MAP (
    DPO   => dout(i),             -- read-port data out
    SPO   => OPEN,                -- write-port data out, unused
    A0    => wr_ptr(0),           -- write address
    A1    => wr_ptr(1),
    A2    => wr_ptr(2),
    A3    => wr_ptr(3),
    D     => din(i),              -- write data
    DPRA0 => rd_ptr(0),           -- read address
    DPRA1 => rd_ptr(1),
    DPRA2 => rd_ptr(2),
    DPRA3 => rd_ptr(3),
    WCLK  => clk,
    WE    => wr_en_int);

The issue?
Whilst debugging way back… many moons ago… I put in a RAM plugin that implemented RAM cells as standard flip-flops. It was called "RAM16X1D", whereas the Xilinx RAM plugin was called "RAM16X1D_1".
Somehow the "_1" was deleted, and ever since then the Bitcoin code has been using individual flip-flops to implement the RAM… instead of the RAM so thoughtfully provided by the Xilinx FPGA.
So the problem is fixed?
Well, that depends on exactly what we are trying to accomplish: are we just trying to stop the Xilinx tools from building RAM out of raw flip-flops, or are we attempting to accomplish some larger goal?

The Goal
The plan for the RAM was actually twofold:
1. To provide a separation mechanism between the bitcoin generation core and the serial communication module.
Fixing the serial module directly onto the back end of the generation core would mean clocking the whole design so that the serial module gets a clock matching one of the industry-standard baud rates.
2. To implement a FIFO so that any generated bitcoin hashes can be queued.
Sometimes hashes are generated in quick succession, so quick in fact that the old hash is erased or corrupted before the serial routine has had time to send it to the controlling computer. The serial system sends the hash bit by bit, only for it to be replaced halfway through by the following hash, and the result is multiple corrupted hashes.

Behold, meet FIFO36_72
The problem with VHDL, or even Verilog, is that it does not understand WHAT you are trying to do. The tools can only infer from your code what they have been programmed to infer, and then guess at what you are trying to accomplish.
This is what separates the clowns from the circus master. The problem with VHDL et al. is that they are portable between devices (which is not always a good thing).
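To make the inference point a bit more concrete, here is a minimal sketch of a 16-deep dual-port memory written behaviourally, with a hint telling the tools which kind of RAM to build rather than leaving them to guess. It is not the code from this design: the entity name small_ram, the WIDTH generic and all the port names are made up for illustration. ram_style is the XST synthesis attribute; other tools spell the hint differently.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity, not part of the miner design.
entity small_ram is
  generic (
    WIDTH : positive := 8               -- assumed data width
  );
  port (
    clk     : in  std_logic;
    we      : in  std_logic;
    wr_addr : in  unsigned(3 downto 0); -- 16 locations, like RAM16X1D
    rd_addr : in  unsigned(3 downto 0);
    din     : in  std_logic_vector(WIDTH - 1 downto 0);
    dout    : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity small_ram;

architecture rtl of small_ram is
  type ram_t is array (0 to 15) of std_logic_vector(WIDTH - 1 downto 0);
  signal mem : ram_t := (others => (others => '0'));

  -- XST hint: build this array out of LUT ("distributed") RAM.
  attribute ram_style : string;
  attribute ram_style of mem : signal is "distributed";
begin
  -- Synchronous write on the rising clock edge.
  write_proc : process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        mem(to_integer(wr_addr)) <= din;
      end if;
    end if;
  end process write_proc;

  -- Asynchronous read through the second address port.
  dout <= mem(to_integer(rd_addr));
end architecture rtl;
```

Swapping "distributed" for "block" asks for block RAM instead, but block RAM wants a clocked read, so the read assignment would have to move inside the clocked process. Either way, check the synthesis report to see what the tools actually built.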

The FPGA tools have been built for professional logic designers and as such have been designed to work the way the designers work, unlike a C++ compiler that has been honed to deal with some of the worst code on the planet.

Moving on
A quick check shows that there is no 'real' FIFO usage in our design; in fact all the 'real' FIFOs are sitting idle… (How do we define 'real'? Simple: we pull the Xilinx HDL reference manual for the chip.)

Using the Xilinx reference manuals and the core wizard to build a FIFO for the Virtex 5, we then replace the 'custom' shit:

COMPONENT fifo_generator_v9_2 IS
  PORT (
    -- Clock and reset
    clk   : IN  std_logic;
    rst   : IN  std_logic;
    -- Write side
    din   : IN  std_logic_vector(63 DOWNTO 0);
    wr_en : IN  std_logic;
    -- Read side and status flags
    rd_en : IN  std_logic;
    full  : OUT std_logic;
    empty : OUT std_logic;
    dout  : OUT std_logic_vector(63 DOWNTO 0)
  );
END COMPONENT;
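For what it is worth, here is a rough sketch of how a component like this might be wired between the hash core and the serial sender. The wrapper entity hash_queue and the signals hash_in, hash_valid, uart_ready, hash_out, hash_avail and overflow are all hypothetical names invented for the example; only the component declaration comes from the text above. It also assumes the core was generated in standard (not first-word-fall-through) mode, so dout updates on the clock edge after a read is accepted.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper: queues hashes between the core and the serial sender.
entity hash_queue is
  port (
    clk        : in  std_logic;
    rst        : in  std_logic;
    hash_in    : in  std_logic_vector(63 downto 0); -- from the hash core
    hash_valid : in  std_logic;                     -- one-cycle strobe
    uart_ready : in  std_logic;                     -- sender wants the next word
    hash_out   : out std_logic_vector(63 downto 0);
    hash_avail : out std_logic;                     -- at least one word queued
    overflow   : out std_logic                      -- a hash arrived while full
  );
end entity hash_queue;

architecture rtl of hash_queue is
  -- Component produced by the core wizard, as declared in the text.
  component fifo_generator_v9_2 is
    port (
      clk   : in  std_logic;
      rst   : in  std_logic;
      din   : in  std_logic_vector(63 downto 0);
      wr_en : in  std_logic;
      rd_en : in  std_logic;
      full  : out std_logic;
      empty : out std_logic;
      dout  : out std_logic_vector(63 downto 0)
    );
  end component;

  signal full, empty  : std_logic;
  signal wr_en, rd_en : std_logic;
begin
  -- Never write into a full FIFO or read from an empty one.
  wr_en <= hash_valid and not full;
  rd_en <= uart_ready and not empty;

  fifo_inst : fifo_generator_v9_2
    port map (
      clk   => clk,
      rst   => rst,
      din   => hash_in,
      wr_en => wr_en,
      rd_en => rd_en,
      full  => full,
      empty => empty,
      dout  => hash_out
    );

  hash_avail <= not empty;
  overflow   <= hash_valid and full;  -- flag dropped hashes instead of corrupting
end architecture rtl;
```

The point of gating wr_en and rd_en is that a hash arriving in quick succession gets queued behind the previous one instead of overwriting it halfway through a send.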

Follow this up with a quick allocation of FPGA area using a 'Pblock'. This ensures that the compiler ALWAYS puts the FIFO in the same place, thereby stabilizing this aspect of the design.

[Image: rev2]

Wow What Happened!!
By centralizing the logic into silicon that is specifically built to act as a FIFO, we free up a shit load of routing resources and some small sections of logic, and by constraining the related logic into an area right next to the FIFO we radically change the layout of the rest of the FPGA.
Net gain… an 80% increase in the speed of the logic, NOT because the FIFO makes things go faster, but because it forces the router to tighten up other parts of the design and reduces the overall fanout.
So where possible, it is imperative that we understand the device we are working with AND that we use the primitives built INTO the silicon, rather than trying to implement the whole design in pure VHDL et al.

Conclusion
Designing for silicon is rather odd… you only waste resources if you do not use them. It is a bit like time: you get so much, and any left unused is just a waste.
Sometimes it pays to use a 72-bit wide, 512 deep FIFO because of what it saves you… even if you only need a few of the FIFO locations.
Throwing in a FIFO can break up an over-extended logic chain, especially if it can be dual clocked.
It means you no longer need to ensure the whole design meets a single set of timings.
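As a sketch of the dual-clocked idea, assuming a FIFO core generated with independent read and write clocks: the write side runs on the hash core's clock and the read side on a clock that suits the serial baud rate. Everything here is hypothetical, including the black-box component dual_clock_fifo, the wrapper clock_crossing_queue and the signal names; the port names are simply modelled on the single-clock component used earlier. It also assumes full is synchronous to wr_clk and empty to rd_clk.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper: the FIFO is the only thing both clocks touch.
entity clock_crossing_queue is
  port (
    core_clk   : in  std_logic;  -- fast clock for the hash core
    uart_clk   : in  std_logic;  -- clock chosen to suit the baud rate
    rst        : in  std_logic;
    hash_in    : in  std_logic_vector(63 downto 0);
    hash_valid : in  std_logic;  -- core_clk domain
    uart_ready : in  std_logic;  -- uart_clk domain
    hash_out   : out std_logic_vector(63 downto 0);
    hash_avail : out std_logic   -- uart_clk domain
  );
end entity clock_crossing_queue;

architecture rtl of clock_crossing_queue is
  -- Assumed black box; it would have to come from the core wizard
  -- or a vendor primitive before this could be built.
  component dual_clock_fifo is
    port (
      rst    : in  std_logic;
      wr_clk : in  std_logic;
      rd_clk : in  std_logic;
      din    : in  std_logic_vector(63 downto 0);
      wr_en  : in  std_logic;
      rd_en  : in  std_logic;
      dout   : out std_logic_vector(63 downto 0);
      full   : out std_logic;   -- assumed valid in the wr_clk domain
      empty  : out std_logic    -- assumed valid in the rd_clk domain
    );
  end component;

  signal full, empty  : std_logic;
  signal wr_en, rd_en : std_logic;
begin
  wr_en <= hash_valid and not full;   -- core_clk side only
  rd_en <= uart_ready and not empty;  -- uart_clk side only

  fifo_inst : dual_clock_fifo
    port map (
      rst    => rst,
      wr_clk => core_clk,
      rd_clk => uart_clk,
      din    => hash_in,
      wr_en  => wr_en,
      rd_en  => rd_en,
      dout   => hash_out,
      full   => full,
      empty  => empty
    );

  hash_avail <= not empty;
end architecture rtl;
```

With this shape no signal other than the FIFO itself crosses between the two clocks, so each side only has to meet its own timing.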

 
