Goodhart’s Law and the Pitfalls of Targeting Load Port Utilisation on Photo Tools

It has been described as the law that rules the modern world, and its effects can be observed in every organisation. I’m referring to Goodhart’s law, named after British economist Charles Goodhart, who wrote the maxim: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.”

A common flavour of this effect is described in the following cartoon, based on a possibly apocryphal story of how central planning failed in a nail factory in the Soviet Union.


We have seen (less dramatic) examples of this effect at work in semiconductor wafer fabs. For instance, teams of operators may be measured on the number of lot moves that occur during their shift. In general, more moves per shift correlates with more wafers delivered on time to customers. However, this relationship breaks down if operators ‘game the system’ by loading batch tools with small batches at the end of the shift, thus wringing out a few extra moves in their shift, but hobbling the next shift.

Memorable though such examples are, they give the impression that Goodhart’s Law relies on people being uninterested in the ultimate goal that their organisation is pursuing. However, apathy is not usually the driving factor in Goodhart’s law; whenever lack of information, limited computational power or even an inability to concisely express our true preferences leads us to substitute a proxy metric for our true goal, the law is bound to rear its head. Former Intel CEO Andy Grove described the effect of such surrogate indicators as like “riding a bicycle: you will probably steer where you are looking”; and if where you’re looking isn’t perfectly correlated with the road ahead, you can expect a wobbly ride!

The intricacies of tools with multiple load ports

For a more subtle example of where using an imperfect measure as a target can lead to suboptimalities when scheduling a wafer fab, we were inspired by a post on the excellent Factory Physics and Automation blog looking at the relationship between load port utilisation and cycle time. In our experience, we have seen load port utilisation of a tool used as a target when designing both operator workflows and dispatching rules.

First, some quick definitions. Many tools in a fab have multiple ‘load ports’ where lots can be inserted into the tool, but only limited chamber capacity, so that, for instance, only one wafer can be processed in a chamber at a time.

Figure 1: Example of a tool with 3 chambers and two load ports.


Consider the machine in Fig. 1 with three chambers and two load ports. Lots can be loaded in either load port, but then each wafer in the lot has to move through Chambers A, B and C one at a time. This means wafers may have to queue inside the tool if the next chamber they need is still processing. Lots must be unloaded at the same load port in which they were inserted. Suppose it takes each chamber 10 minutes to process a wafer, and we want to process two lots, each consisting of three wafers. If we were only allowed to use a single load port, we would have to wait for the first lot to move through all three chambers and exit at the same load port before we could start processing the second lot. Fig. 2 shows that for a simple model (that ignores transfer time between chambers), the second lot will have to wait 50 minutes before it can start processing.

Figure 2: Example of how the tool from Figure 1 would process two 3-wafer lots if only load port 1 were being utilised.


If, however, an operator loads both lots into the two load ports at the same time (Fig. 3), the machine will pick up the first wafer of the second lot as soon as the last wafer of the first lot has finished processing in Chamber A. Thus the second lot will only need to wait 30 minutes.

Figure 3: The same situation as in Fig. 2, except that both load ports are available for use. Once all three wafers of Lot 1 have finished processing in Chamber A, Lot 2 can begin processing.


Therefore, for a given level of WIP at a tool, we can expect higher load port utilisation to be correlated with reduced waiting and therefore improved cycle time.
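The 50-minute and 30-minute waits quoted above can be reproduced with a few lines of Python. This is only a sketch of the simplified model in the text (serial chambers, 10 minutes per wafer per chamber, no transfer time); the function name `simulate` and its parameters are illustrative assumptions, not part of any real tool controller.

```python
def simulate(lots, chambers=3, proc=10, single_port=True):
    """Return {lot_name: (start, finish)} for a tool with serial chambers.

    lots: list of (name, n_wafers) in the order they are processed.
    Simplified model from the text: no transfer time, one wafer per chamber.
    """
    chamber_free = [0] * chambers  # time at which each chamber is next free
    port_free = 0                  # time at which the single port is released
    result = {}
    for name, n_wafers in lots:
        # With a single load port, the next lot cannot be loaded until the
        # previous lot has completely left the tool.
        earliest = port_free if single_port else 0
        start = finish = None
        for _ in range(n_wafers):
            t = earliest
            for c in range(chambers):
                t = max(t, chamber_free[c])  # queue if the chamber is busy
                if c == 0 and start is None:
                    start = t                # first wafer enters Chamber A
                t += proc
                chamber_free[c] = t
            finish = t
        port_free = finish
        result[name] = (start, finish)
    return result


lots = [("Lot 1", 3), ("Lot 2", 3)]
one_port = simulate(lots, single_port=True)
two_ports = simulate(lots, single_port=False)
print(one_port["Lot 2"])   # (50, 100)
print(two_ports["Lot 2"])  # (30, 80)
```

Under these assumptions, the second lot starts 20 minutes earlier and also finishes 20 minutes earlier when both load ports are used.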

Indeed, in cases where a wafer cannot be unloaded from a tool until all the wafers in the same lot are also ready to be unloaded (a common workflow), it can actually make sense to split lots before a chamber tool. For instance, suppose we have a lot of 6 wafers waiting before the tool in Fig. 1. If we load all the wafers as a single lot in one load port, it will take 80 minutes for all 6 wafers to move through the three chambers before we can unload the lot. If, however, we split the original lot into two lots of three and load them into both load ports (as in Fig. 3), then the first lot can be unloaded after just 50 minutes, and can potentially continue to its next step earlier.
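The 80-minute and 50-minute figures follow from a simple pipeline formula. As a sketch under the same simplified model (the symbols $n$, $m$, $p$ and $T$ are introduced here for illustration and do not appear in the original), a lot of $n$ wafers passing through $m$ serial chambers, each taking $p$ minutes per wafer, can be unloaded after:

```latex
T(n) = (n + m - 1)\,p
\qquad\Rightarrow\qquad
T(6) = (6 + 3 - 1)\cdot 10 = 80 \text{ min},
\quad
T(3) = (3 + 3 - 1)\cdot 10 = 50 \text{ min}.
```

The second split lot enters Chamber A at 30 minutes and leaves the tool at $30 + T(3) = 80$ minutes, so in this model splitting releases half of the wafers 30 minutes earlier without delaying the other half.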

How directly targeting load port utilisation can harm cycle time

As predicted by Goodhart’s Law, the correlation between load port utilisation and fab cycle time breaks down once we try to optimize directly for load port utilisation. This breakdown is particularly stark on photolithography tools, where process steps rely on a critical secondary resource: reticles. Reticles (also called photomasks) act like stencils in the expose step of a photolithography process, patterning the wafer with the desired features. In most photo tools, reticles must be loaded onto the tool in containers, called pods, before the lots that require them can be loaded onto the machine. Therefore, if a lot is inserted into a load port early, its wafers may simply sit waiting inside the machine. Moreover, loading the lot early also means loading its reticle early, tying the reticle to this machine when it could have a more productive use elsewhere.

For a simple example, consider a toolset consisting of two of the tools from Fig. 1 (we can imagine chambers A, B and C are performing coat, expose and develop operations respectively).

Suppose we have just loaded a 3 wafer lot onto tool 1. The other load port of tool 1 remains free. Meanwhile on tool 2, both load ports are utilised, but there are only two wafers yet to be processed in Chamber A.

A lot (lot X) that requires a special reticle (of which only one exists) arrives. Due to a lot-level restriction, lot X can only run on tool 1. This sort of restriction is particularly common in photolithography, where running consecutive photo layers through the same tool (even if there are multiple tools qualified for the operation) can reduce product variability caused by idiosyncratic aspects of the lens of a particular tool (this is sometimes known as a ‘lot-to-lens’ dedication).

The operators on this toolset abide by the following rule for dispatching lots:

Rule 1: If a load port and the required reticle are available, load the reticle and the lot onto the tool.

Since tool 1 has a load port available, the operator immediately loads the reticle onto the machine, and puts lot X into the load port.

Ten minutes later, lot Y arrives at the toolset, also requiring the same reticle, and with a lot-level restriction forcing it to run on tool 2. Since the reticle is already loaded on tool 1, lot Y cannot be dispatched until lot X has finished processing and the reticle has been moved from tool 1 to tool 2. Lot X itself cannot start until Chamber A of tool 1 frees up after 30 minutes, so it finishes after 80 minutes. Assuming, for simplicity, that the reticle moves instantaneously, lot Y then runs from 80 to 130 minutes, so both lots will have finished processing in 130 minutes’ time (see Fig. 4).

Figure 4: Example of processing on the two machine toolset when operators follow Rule 1 for dispatching lots


Imagine, however, the operators adopted the following workflow:

Rule 2: If a load port and the required reticle are available and the tool can begin processing immediately (i.e. Chamber A is free), load the reticle and the lot onto the tool.

In this case, lot X will not be immediately loaded onto tool 1, since Chamber A is initially occupied. After only 20 minutes, though, lot Y can be loaded onto tool 2 and finish processing 50 minutes later, at which point the reticle can be moved and lot X can start on tool 1. Thus, after just 120 minutes (as opposed to the 130 minutes under Rule 1), both lot X and lot Y will have finished processing. Therefore, we can see that by adopting Rule 2, the cycle time, and hence the throughput, of the toolset can be improved.

Figure 5: Example of processing on the two machine toolset when operators follow Rule 2 for dispatching lots.
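The 130-minute and 120-minute outcomes can be checked with a small sketch of the two rules. It assumes the simplified model used in the example: Chamber A of tool 1 is busy for 30 minutes, Chamber A of tool 2 for 20 minutes, a 3-wafer lot takes 50 minutes once it enters Chamber A, a load port is free whenever needed, and the single reticle is held from loading until its lot finishes and then moves instantly. The names `run` and `LOT_DURATION` are illustrative only.

```python
LOT_DURATION = 50  # 3 wafers through 3 chambers at 10 min each


def run(rule):
    """Return {lot: finish_time} when operators follow Rule 1 or Rule 2."""
    chamber_a_free = {"tool 1": 30, "tool 2": 20}
    # lot -> (dedicated tool, arrival time in minutes)
    pending = {"X": ("tool 1", 0), "Y": ("tool 2", 10)}
    reticle_free = 0  # time at which the single reticle is next available
    finish = {}

    def load_time(lot):
        tool, arrival = pending[lot]
        t = max(arrival, reticle_free)
        if rule == 2:
            # Rule 2 additionally waits until Chamber A can start at once.
            t = max(t, chamber_a_free[tool])
        return t

    while pending:
        lot = min(pending, key=load_time)  # whichever lot can load first
        loaded = load_time(lot)
        tool, _ = pending.pop(lot)
        start = max(loaded, chamber_a_free[tool])
        finish[lot] = start + LOT_DURATION
        reticle_free = finish[lot]         # reticle is locked until then
    return finish


rule1 = run(1)
rule2 = run(2)
print(rule1, max(rule1.values()))  # {'X': 80, 'Y': 130} 130
print(rule2, max(rule2.values()))  # {'Y': 70, 'X': 120} 120
```

Note that Rule 2 only wins here because lot Y happens to arrive shortly after lot X; without knowledge of that arrival, an operator cannot tell in advance which rule is better.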


In our experience of wafer fabs, we often see workflows akin to Rule 1, wherein operators fill the load ports of photo tools as soon as they are free, thus forfeiting the opportunity to use reticles earlier on different tools. Adopting a workflow like Rule 2, however, is more difficult, since it requires operators to know in advance when the tool will be ready to process a new lot, and to react promptly to load the tool at precisely this time. In practice, particularly when operator availability is limited, failing to load a lot as soon as a machine becomes available leaves the tool underutilised and risks increasing wait time.

Using advanced optimization to handle Goodhart's Law

Flexciton’s scheduler can help to alleviate this problem by employing advanced optimization technology. It can predict when lots will arrive at the photo toolset and which reticles they will require, and then jointly schedule the reticles and lots on the toolset to obtain an optimized schedule. The knowledge of future arrivals crucially allows us to identify cases where loading a reticle onto a machine now is suboptimal, since a lot will soon arrive at another tool that can make use of the reticle sooner or that simply has a higher priority. Thus, following a Flexciton schedule, operators can dispatch to load ports when they become available, with minimal risk of harming cycle time due to locking in reticles prematurely.

However, we are still not immune to the curse of Goodhart’s Law. The cycle time of an optimized schedule is itself only a proxy for what we actually care about: producing more high-quality wafers at a low cost per wafer. Over-optimizing for cycle time may lead to a solution with so many loads and unloads that the labour cost of running the fab becomes prohibitive. Or, as described in one of our previous blog posts, the solution may require moving reticles so frequently between tools that we increase the chance of a costly breakage.

To solve this, we apply a technique suggested by Andy Grove himself: we pair indicators. Combining indicators, where one has an effect counter to the other, avoids the trap of optimizing one at the expense of another. This is why we typically pair cycle time with the number of batches (to account for limited operator availability) or the number of reticle moves (to keep the risk of reticle damage low), thus mitigating the perils of Goodhart’s Law.
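As a rough illustration of pairing indicators, the sketch below scores candidate schedules on cycle time plus a penalty per reticle move. The weighted sum, the `minutes_per_move` weight and the three candidate schedules are all made-up assumptions for illustration; the post does not say how the two indicators are actually combined in Flexciton’s scheduler.

```python
from dataclasses import dataclass


@dataclass
class Schedule:
    name: str
    cycle_time_min: float  # proxy we want to keep low
    reticle_moves: int     # counter-indicator: each move risks damage


def paired_cost(s, minutes_per_move=15.0):
    """Hypothetical paired objective: cycle time plus a per-move penalty."""
    return s.cycle_time_min + minutes_per_move * s.reticle_moves


# Illustrative candidates only, not real fab data.
candidates = [
    Schedule("A", cycle_time_min=118, reticle_moves=6),
    Schedule("B", cycle_time_min=125, reticle_moves=2),
    Schedule("C", cycle_time_min=150, reticle_moves=1),
]

best_single = min(candidates, key=lambda s: s.cycle_time_min)
best_paired = min(candidates, key=paired_cost)
print(best_single.name, best_paired.name)  # A B
```

Optimizing cycle time alone picks the schedule with the most reticle moves, whereas the paired objective settles on a compromise.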

The Flex Factor with... James

Meet James Adamson, one of our senior optimization engineers here at Flexciton. Many, many moons ago he was an aspirant farmer; now he’s designing and improving our scheduling algorithms.

Tell us what you do at Flexciton?

I’m an Optimization Engineer, which essentially means I focus on designing and improving our scheduling algorithms, while also implementing and maintaining them in production code. I also have a technical lead role for one of our customers, so I spend some time understanding their requirements in detail and thinking about how to expand the product or customise it to meet their individual needs.  

What does a typical day look like for you at Flexciton?

In my engineering team we kick things off with a stand-up to agree on priorities for the day and discuss any issues that need attention. My day would then typically be a mix of drinking coffee, getting stuck into writing code for some new functionality, and having design discussions with other members of the team to keep us aligned technically. 

What do you enjoy most about your role?

I would say the opportunity to combine two things: working on one of the most challenging optimisation problems out there; and the ability to actually have an impact, for example through getting my code into production or making and influencing key design decisions.

If you could give one piece of advice to someone, what would it be? 

I would maybe suggest they seek advice from better places… but no, I think it’s important to always be thinking about what it is you want, and to think several steps ahead. It’s all too easy to get stuck doing something you don’t enjoy.

If you could summarise working at Flexciton in 3 words, what would they be?

Interesting, challenging, impactful.

If you could swap jobs with anyone for a day, who would it be and why?

I used to want to be a farmer… so provided I could pick a day with decent weather then sure, why not give that a go for a day. I reckon it’s much harder work than the idea I used to have of chilling on a combine harvester though…

Tell us about your best memory at Flexciton?

There’s a whole bunch of memories from our team trips, most recently to Albufeira in Portugal where some people really shone with their dance moves. I will avoid naming names.

Scheduling Innovations: Academic Research and its Adoption in the Semiconductor Industry

This article focuses on innovations in scheduling: algorithms which assign lots to machines, decide in which order they should run, and ensure any required secondary resources are available.

Introduction

The first integrated circuits were invented by Texas Instruments and Fairchild Semiconductor in 1959. Today, semiconductor manufacturing is a $600 billion industry, and microchips are ubiquitous and impact our lives in ever-increasing ways. To achieve such astonishing growth, academics and industry have had to constantly innovate, researching new production technologies. While much has been said about Moore's law and the push towards higher and higher transistor densities, the innovations made in how the billion-dollar factories producing these chips are run have received less attention. This article focuses on innovations in scheduling: algorithms which assign lots to machines, decide in which order they should run, and ensure any required secondary resources (e.g. reticles) are available. These decisions can significantly impact the throughput and efficiency of wafer fabs.

Many innovative technologies in scheduling were first proposed by researchers and have, over time, been adopted in manufacturing. They include:

  • Dispatching: rule-based systems for deciding which lot to run next on a tool 
  • Optimization-based scheduling: mathematical techniques like mixed integer programming and constraint programming which can generate optimal machine assignments, sequencing, and more for entire toolsets or areas of the fab, improving fab-wide objectives like cycle-time or cost
  • Simulation: computer models of the manufacturing process which are often used to run what-if analysis, evaluate performance, and aid decision making

From dispatching to mathematical programming

Early academic research on dispatching rules dates back to the 1980s. Authors at the time already highlighted the significant impact scheduling can have on semiconductor manufacturing. They experimented with different types of dispatching rules, ranging from simple first-in-first-out (FIFO) rules to more bespoke rules focused on particular bottleneck tools. Over time, dispatching rules have evolved from fairly simple to increasingly complex. Rule-based dispatching systems quickly became the state of the art in the industry and continue to be popular for several reasons: they can be intuitive and easy to implement, yet are flexible enough to cover varying requirements. There are, however, also many situations in which dispatching rules may perform poorly: they have no foresight and generally look only at a single tool, and therefore often struggle with load balancing between tools. They also struggle with more advanced constraints such as time constraints or auxiliary resources, e.g. reticles in photolithography. More generally, dispatching systems are a mature technology that has been pushed to its limits and is unlikely to lead to significant further increases in productivity and yields.
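To make the idea of a dispatching rule concrete, here is a minimal sketch of two rules expressed as selection functions over the queue at a single tool. The `Lot` fields and the `priority_then_fifo` rule are hypothetical examples, not rules taken from any particular fab or from the literature discussed here.

```python
from dataclasses import dataclass


@dataclass
class Lot:
    name: str
    arrival: int   # minutes since the start of the shift
    priority: int  # larger means more urgent (hypothetical field)


def fifo(queue):
    """First-in-first-out: run the lot that has waited longest."""
    return min(queue, key=lambda lot: lot.arrival)


def priority_then_fifo(queue):
    """Most urgent lot first; ties are broken by arrival order."""
    return min(queue, key=lambda lot: (-lot.priority, lot.arrival))


queue = [Lot("L1", 0, 1), Lot("L2", 5, 3), Lot("L3", 8, 3)]
print(fifo(queue).name)                # L1
print(priority_then_fifo(queue).name)  # L2
```

Both functions see only the lots currently queued at one tool, which is exactly the lack of foresight and cross-tool awareness described above.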

For these reasons, focus has shifted over time to alternative technologies, especially deterministic scheduling based on mixed-integer programming or constraint programming. In the academic literature, these approaches began to appear with increasing frequency in the 1990s. Early contributions focused on analysing the complexity of the wafer fab scheduling problem and solved the resulting optimization problem using heuristic techniques, but slowly moved towards rigorously scheduling single machines, tackling one particular aspect of the problem at a time. Due to the limited scope deterministic techniques could initially tackle, their adoption in industry lagged behind the academic discussion.
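For readers unfamiliar with mixed-integer programming, a standard textbook way to model sequencing on a single machine uses "big-M" disjunctive constraints. The formulation below is a generic sketch, not the model used in any of the work referred to above; the symbols are assumptions introduced here: $s_j$ is the start time of lot $j$, $p_j$ its processing time, $r_j$ its arrival time, $y_{ij} = 1$ if lot $i$ runs before lot $j$, and $M$ is a sufficiently large constant.

```latex
\begin{aligned}
\min \quad & \sum_{j} (s_j + p_j) \\
\text{s.t.} \quad
  & s_j \ge r_j                          && \forall j \\
  & s_j \ge s_i + p_i - M\,(1 - y_{ij})  && \forall i < j \\
  & s_i \ge s_j + p_j - M\,y_{ij}        && \forall i < j \\
  & y_{ij} \in \{0, 1\}                  && \forall i < j
\end{aligned}
```

For each pair of lots, exactly one of the two ordering constraints is active, so the solver chooses the sequence as part of the optimization rather than following a fixed rule.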

From single machines to fab-wide scheduling

The last twenty years have seen deterministic scheduling techniques mature and schedule larger and more complex fab areas. In the academic literature, authors moved from focusing on single (batching) tools, to entire toolsets or larger areas of the fab including re-entrant flows. They also started including more and more operational constraints such as sequence-dependent setup and processing times, time constraints, or secondary resources such as reticles. In order to achieve this increase in scale and complexity, researchers have applied a large number of optimization techniques, and often combined rigorous mathematical programming methods with heuristic approaches. Some have used general purpose meta-heuristics, such as genetic algorithms or simulated annealing, while others have developed bespoke heuristics for fab scheduling, such as the shifting bottleneck heuristic.

As the size of problems that optimization-based scheduling techniques could solve grew, the industry started to explore how to adopt these methods in practice. For example, in 2006, IBM announced that it had successfully used a combination of mixed-integer programming and constraint programming to schedule an area of a fab with up to 500 lot-steps and that this had led to a significant reduction in cycle time. Our own technology at Flexciton leverages mathematical optimization and smart decomposition, combined with modern cloud computing, to efficiently schedule entire fabs. One key advantage of using cloud technology is the ability to access huge amounts of computational power. It allows complicated problems to be broken down and accurate schedules to be delivered every few minutes, and it allows the solution strategy to be adapted to the complexity at hand. Additionally, it enables responsive adjustments as events unfold in real time, allowing for a truly dynamic approach to scheduling.

Optimization-based scheduling’s trajectory from an academic niche to a high-impact technology has partially been accelerated by two major trends:

The process has been accompanied by considerable improvements in productivity, as scheduling is able to overcome many of the downsides of dispatching: it can look ahead in time, balance WIP across tools, and improve fab-wide objectives such as cost or cycle-time. A major advantage of scheduling is that it can both increase yields when demand is high and reduce cost when demand is low. 

When in doubt, simulate.

A discussion of scheduling in wafer fabs would not be complete without a word on simulation models. Simulation models are technically not scheduling algorithms: they require dispatching rules or deterministic scheduling inside them to decide machine assignment and sequencing. But they have been used to evaluate and compare different scheduling approaches from the very beginning. They were also quickly adopted by industry and have, for example, been used by STMicroelectronics to re-prioritise lots and by Infineon to help identify better dispatching rules. The development of highly reliable simulation models could greatly increase their use for performance evaluation and scheduling.
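The role of simulation as a test bench for dispatching rules can be illustrated with a toy single-tool model. This is a deliberately minimal sketch with made-up arrival and processing times, far simpler than the fab simulation models mentioned above; the function `simulate` and the two rules are illustrative assumptions.

```python
def simulate(lots, rule):
    """Run a single tool and return the mean cycle time in minutes.

    lots: list of (arrival_time, processing_time) tuples.
    rule: key function; the queued lot with the smallest key runs next.
    """
    pending = sorted(lots)  # lots that have not arrived yet
    queue, clock, total = [], 0, 0
    while pending or queue:
        # Move every lot that has arrived by now into the queue.
        while pending and pending[0][0] <= clock:
            queue.append(pending.pop(0))
        if not queue:
            clock = pending[0][0]  # tool idles until the next arrival
            continue
        lot = min(queue, key=rule)  # the dispatching decision
        queue.remove(lot)
        clock += lot[1]
        total += clock - lot[0]     # cycle time = finish minus arrival
    return total / len(lots)


def fifo(lot):
    return lot[0]  # earliest arrival first


def shortest_first(lot):
    return lot[1]  # shortest processing time first


lots = [(0, 30), (5, 10), (10, 5), (12, 20)]  # illustrative data only
print(simulate(lots, fifo))            # 38.25
print(simulate(lots, shortest_first))  # 37.0
```

The simulation itself makes no scheduling decisions; it only replays whatever rule is plugged in, which is why it is useful for comparing approaches.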

The future

More reliable simulation models are also important in light of recent trends in academic literature, which may provide a glimpse into the future of wafer fab scheduling. Rigid dispatching rules that need to be (re)tuned frequently may soon be replaced by deep reinforcement learning agents which learn dispatching rules that improve overall fab objectives. In some studies, such systems have been shown to perform as well as dispatching systems based on expert knowledge. Whether and when the industry adopts such techniques on a large scale remains to be seen. Since they require accurate simulation models as training environments, they can be extremely computationally intensive, and their adoption will largely depend on the development of faster training and simulation models. The combination of self-learning dispatching systems and comprehensive, scalable scheduling models may well hold the key to unlocking unprecedented improvements in fab productivity.

Flexciton aspires to be the key enabler in this transition, bringing state-of-the-art scheduling technology to the shop floor in a modern, sophisticated, and user-friendly platform unlike anything else on the market. Despite the enormous challenges that come with the scale of this endeavour, the initial results are very encouraging; cloud-based optimization solutions can indeed bring a step change to streamlining wafer fab scheduling while delivering consistent efficiency gains. 

The Flex Factor with... Charlotte

This month on The Flex Factor, we get to know our Senior People & Talent Partner, Charlotte Conway! Find out a little more about her and how she creates a supportive environment that helps our whole team to thrive.

Tell us what you do at Flexciton?

I work across both the People and Talent function as a Senior People & Talent Partner. I help Flexciton to find, attract and recruit top talent, and am responsible for engaging, supporting and developing our employees.

What does a typical day look like for you at Flexciton?

There is no such thing as a typical day in a startup! However, my day is often split 80% on the people side and 20% on talent. I like to start my day with any admin tasks or reply to any Slack messages that might have come through. I then create a to-do list for what I plan to do that day. This can be dealing with employee queries, or business partnering with managers to check in on any people-related matters. During busier periods I will often be taking a hands-on approach to hiring, sourcing and speaking to candidates as well as setting up our talent processes and looking at our employer branding strategy to help us to attract the best talent. As a startup there are also lots of projects to get involved in across all of HR (e.g. performance management, L&D) so a lot of my day may involve working on improving our people and talent processes... or implementing new processes!

What do you enjoy most about your role?

What I enjoy most about my role is getting to work closely with our people (I guess it’s in the name, ‘people partner', right?). For me, the important part of being a ‘people’ partner is creating an environment where people feel heard, supported, and empowered to bring their best selves to work. Being able to have a small part in ensuring employees have all of the above is incredibly rewarding and fulfilling.

What's a quote that you live by?

“I've learned that people will forget what you said, people will forget what you did, but people will never forget how you made them feel.”

― Maya Angelou

If you could summarise working at Flexciton in 3 words, what would they be?

Exciting, dynamic and FUN.

If you could give one piece of career advice to someone, what would it be?

Never doubt yourself or let fear of failure hold you back. It’s ok to make mistakes and take risks! It’s better to look back and never have that feeling of ‘what if’ because you were too scared to take the next step.

Tell us about your best memory at Flexciton?

There are lots! However, it’s one of the many fun Flexciton socials that comes to mind: Dabbers Bingo. What better way to celebrate with your colleagues than with some good, old-fashioned competition. There was dancing, music and of course bingo. This was then followed by a late night showing of Shrek in the office, and a very patient colleague (thanks Jannik) failing miserably to teach me how to ride a bike… I blame the one too many glasses of prosecco!

Interested in working at Flexciton? Head over to our careers page to check what vacancies we currently have available and learn a little more about us whilst you're there.