Order Picking in a Large-Scale Modern Warehouse – A HANA/R Use Case


Warehouse optimisation has always been critical to the success of retailers, but as retailers grow to export goods globally and expand sales through e-commerce, it becomes even more so. There are many operational processes within a warehouse that can be optimised, including batching, zoning and warehouse layout. One area of particular importance to large warehouses is picking efficiency – that is, how quickly one or more orders can be filled by a person or automated machine physically traversing the warehouse and picking items off the shelves.


Warehouse picking accounts for as much as 60% of labour activity in the warehouse. Thus, even a small improvement in the time taken to fill one order is of great benefit to businesses operating on a large scale.

And speaking of large scale, consider Amazon’s many warehouses, called “fulfilment centres”. As of 2013, Amazon had over 27 fulfilment centres totalling over 24 million square feet. Amazon not only sells its own products, but offers Fulfilment By Amazon (FBA) services, in which Amazon provides warehousing and logistics support for third-party sellers that sell stock through Amazon or other sites such as eBay.


Clearly, there’s a need for a big data storage solution to house huge amounts of product information and transactional data. On top of this, there is also a need for a smart and efficient analytics approach, i.e., one that produces a near-optimal route for the picker to follow in reasonable time. There are three types of efficiency that we are concerned with here, outlined below.


Let’s talk first about why the solution must be smart, starting with the scalability of the physical warehouse.

The route taken by the picker through the warehouse is itself a sort of algorithm – it is a process that takes an input (one or more order lists), produces an output (a filled container), and describes a process to achieve this (a physical route that the picker must follow).  As a retail business grows, the number of orders to fill and the number of items per order will grow, although the customer shipment time commitments will remain unchanged. Even when fulfilling a single order, the travel distance of an inefficient route may be much larger than that of an efficient one. When filling potentially tens of thousands of orders with thousands of items per order, the inefficiencies become unmanageable: running the warehouse becomes extremely costly and shipment commitments cannot be met.

On top of this, we know that as the warehouse grows in size, the physical length of the aisles and cross-aisles that must be traversed will grow, again magnifying the costs of route inefficiencies.


There is a need for an analytics solution that can handle a dynamic warehouse environment, where the list of orders to be filled changes in real time.  In a typical warehouse, new orders are constantly flowing in, the priority of order fulfilment and shipping may change, and products may go out of stock or be re-stocked. The algorithm must be able to calculate the route quickly and frequently throughout a typical day.  The analytic route calculator is of course not a physical process but a computational one, and so the algorithm driving it must also be efficient.


Finally, for each route calculation request, there is potentially a large transfer of data. The route calculator must have fast access to data structures storing the current order list and the current warehouse status (e.g. stock levels). Therefore our solution must allow efficient communication between the big data storage platform and the analytics engine.

Now, let’s talk about a HANA-based solution that addresses the need to house big data and the efficiency requirements of order picking within a modern, large-scale warehouse.


So we turn our attention to a readily available suite of technology and propose a SAP HANA platform solution:

  • a service to accept near-streaming data (HTTP end-points, natively supported in HANA XS) and to query external data sources (Smart Data Access, natively supported in HANA DB);
  • a customisable, integrated analytics and route calculation engine (R server, available through native HANA XS connectivity);
  • a visualisation and client framework (UI5, optionally embedded in your Fiori Launchpad).

With this combination and a little elbow grease, we can take inbound live orders and pass these to our R server for analytical processing. Once calculated (or recalculated), the route is passed back to HANA, where it is either displayed in a simple UI5 app to pickers, or converted to a set of instructions in an automated setting.

But first, we look at calculating an optimal route (the first efficiency), with an efficient algorithm (the second efficiency).


The analytics problem is to pick the items from one (or more) orders in a sequence such that the picker travels the minimum distance through the warehouse to complete the task.  We will call this “the pick order problem”.  This problem has many variants, each with slightly different assumptions. These assumptions may include the picker starting point, the deposit point, and the number of cross-aisles and blocks in the warehouse.

The problem can be formulated as the classic Travelling Salesman Problem (TSP), in which the Travelling Salesman must visit each city in a network of cities exactly once before returning home.  The optimal route travelled by the Salesman is the shortest one. We can convert our pick order problem to a TSP graph problem by representing the warehouse as a graph G, with black vertices at the locations of items to be picked and white vertices at the cross-points between “aisles” and “cross-aisles”.


Figure taken from roodbergen.com
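As a concrete sketch, a tiny warehouse graph of this kind can be represented as a weighted adjacency structure. The junction names, pick locations and distances below are purely illustrative, not taken from any real layout:

```python
# A toy warehouse graph: white vertices (j*) are aisle/cross-aisle
# junctions, black vertices (p*) are pick locations. Edge weights are
# walking distances in metres. Layout and distances are illustrative.
warehouse = {
    "j1": {"p1": 2, "j2": 4},
    "j2": {"j1": 4, "p2": 1, "j3": 4},
    "j3": {"j2": 4, "p3": 3},
    "p1": {"j1": 2},
    "p2": {"j2": 1},
    "p3": {"j3": 3},
}
picks = ["p1", "p2", "p3"]  # the black vertices that must be visited
```

Note that every edge appears in both directions, since aisles can be walked either way.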

The problem differs a bit from the classic TSP in that not all vertices have to be visited (white vertices may be visited but need not be; black vertices must be).  Also, it is permissible to visit a vertex more than once. This problem is classified as a Steiner TSP.  The graph algorithm problem is to find the minimum-length tour that visits each black vertex at least once.
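One standard way to handle this (a sketch of one approach, not the only one) is to reduce the Steiner TSP to a classic TSP: compute the shortest walking distance between every pair of black vertices – routing freely through the white vertices – and then solve an ordinary TSP over that complete “metric closure”. A minimal Python illustration, using a hypothetical layout:

```python
# Reduce a Steiner TSP to a classic TSP via the metric closure:
# shortest distances between all pick locations, allowing walks
# through junctions. Graph and distances are illustrative.
graph = {
    "j1": {"p1": 2, "j2": 4},
    "j2": {"j1": 4, "p2": 1, "j3": 4},
    "j3": {"j2": 4, "p3": 3},
    "p1": {"j1": 2},
    "p2": {"j2": 1},
    "p3": {"j3": 3},
}

def shortest_paths(graph):
    """All-pairs shortest path lengths via Floyd-Warshall."""
    nodes = list(graph)
    INF = float("inf")
    d = {u: {v: (0 if u == v else graph[u].get(v, INF)) for v in nodes}
         for u in nodes}
    for k in nodes:
        for u in nodes:
            for v in nodes:
                if d[u][k] + d[k][v] < d[u][v]:
                    d[u][v] = d[u][k] + d[k][v]
    return d

d = shortest_paths(graph)
picks = ["p1", "p2", "p3"]
# The metric closure: a complete distance matrix over the picks only,
# ready to be fed to any classic TSP solver.
closure = {u: {v: d[u][v] for v in picks} for u in picks}
```

A tour found on the closure can then be expanded back into an actual walking route by replacing each leg with its underlying shortest path through the junctions.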


Possible algorithms can be classified into those that solve the problem exactly, and those that solve it heuristically. To solve the problem exactly means that the final solution is guaranteed to be optimal (the minimum-length tour). This is typically achieved with techniques such as dynamic programming, branch-and-bound or integer programming, which systematically account for every possible tour rather than committing to whichever item looks best at each step.
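For small instances, the exact approach can be sketched with the Held-Karp dynamic program, which considers every subset of picks via memoised recursion (the distance matrix below is illustrative):

```python
from functools import lru_cache

# Held-Karp dynamic program: an exact TSP solver. It is exponential in
# the number of picks, so it is only practical for small orders.
# The distance matrix is illustrative.
dist = [
    [0, 7, 13, 9],
    [7, 0, 8, 5],
    [13, 8, 0, 6],
    [9, 5, 6, 0],
]
n = len(dist)

@lru_cache(maxsize=None)
def cost(visited, last):
    # visited: bitmask of picks already collected; last: current position.
    if visited == (1 << n) - 1:
        return dist[last][0]  # all picked: walk back to the start (pick 0)
    best = float("inf")
    for nxt in range(n):
        if not visited & (1 << nxt):
            best = min(best, dist[last][nxt] + cost(visited | (1 << nxt), nxt))
    return best

optimal = cost(1, 0)  # begin at pick 0 with only pick 0 visited
```

The memoisation keeps the running time at O(n² · 2ⁿ) rather than the O(n!) of naive enumeration – still exponential, which is why heuristics matter for big warehouses.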

To solve the problem heuristically means that the final solution will be good, but not necessarily optimal. Problems are often solved heuristically when they cannot be solved exactly in reasonable time (for large instances). Heuristics can account for other factors too, such as aisle congestion. A heuristic uses a particular measure when choosing the next step (selecting the next item to pick) to estimate which choice will lead to a near-optimal path.
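A minimal sketch of one such heuristic – nearest-neighbour selection, i.e. always walk to the closest remaining item – using an illustrative distance matrix:

```python
# Nearest-neighbour heuristic: from the current position, always pick
# the closest remaining item next. Quadratic time, but with no
# guarantee of optimality. The distance matrix is illustrative.
def nearest_neighbour(dist, start=0):
    n = len(dist)
    tour = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        here = tour[-1]
        nxt = min(remaining, key=lambda j: dist[here][j])
        tour.append(nxt)
        remaining.remove(nxt)
    tour.append(start)  # return to the depot
    length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
    return tour, length

dist = [
    [0, 7, 13, 9],
    [7, 0, 8, 5],
    [13, 8, 0, 6],
    [9, 5, 6, 0],
]
tour, length = nearest_neighbour(dist)
```

On this toy instance the greedy tour is 0 → 1 → 3 → 2 → 0 with length 31, while the optimal tour has length 30 – fast, but slightly suboptimal, which is exactly the trade-off heuristics make.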


It’s clear that it would be hard for SAP to create a one-size-fits-all native solution for this use case in its Predictive Analytics package. Every warehouse has its own unique set of constraints, goals, and real-time dynamics. The beauty of R is that it is both an environment and a programming language – giving the developer the flexibility to create a customised analytics solution.  Moreover, because of its wide use and open-source nature, new packages are constantly added to the repository and can be utilised in a customised solution.

In R, one or more algorithms can perform the analytics. Ratliff and Rosenthal (1983) developed an algorithm that solves the minimum-length tour problem exactly, with time complexity linear in the number of aisles, although no packages are available for this algorithm.  Packages are available in R that interface to good TSP solvers, e.g. Concorde (http://www.math.uwaterloo.ca/tsp/concorde.html), although general TSP solvers are not guaranteed to run efficiently on large instances – a consideration when working with large warehouses.  Alternatively, an efficient heuristic-based approach can be customised for the client warehouse and coded within R.

Finally, regarding the third efficiency – the need for fast communication between incoming order updates and route re-calculation and return – we note that the HANA/R interface allows for speedy communication between the data tables and the analytics processing that data.

Sandra Cutic