Magicsheet logo

Optimize Water Distribution in a Village

Hard
25%
Updated 8/1/2025

Optimize Water Distribution in a Village

What is this problem about?

The Optimize Water Distribution in a Village problem gives you n houses where each house can either build its own well (with cost) or connect to an already-watered house (via pipe with cost). Find the minimum total cost to supply water to all houses. This hard coding problem reduces to a Minimum Spanning Tree problem with a virtual node. The union find, graph, heap, and minimum spanning tree interview pattern is the core.

Why is this asked in interviews?

Apple and Google ask this because it demonstrates the transformation trick: add a virtual node 0 connected to each house with the well-building cost. Then find the MST of this augmented graph — edges in the MST represent either well construction (virtual 0 edges) or pipe connections. The standard Kruskal's or Prim's MST algorithm applies directly.

Algorithmic pattern used

Virtual node + Kruskal's MST. Add node 0. For each house i, add edge (0, i, wells[i]). Add all pipe edges (house_j, house_k, cost). Sort all edges by cost. Apply Kruskal's with Union-Find: greedily add cheapest edges that don't form a cycle until all n+1 nodes are connected. Total cost of the MST = minimum water supply cost.

Example explanation

n=3, wells=[1,2,2], pipes=[(1,2,1),(2,3,1)]. Virtual edges: (0,1,1),(0,2,2),(0,3,2). Pipe edges: (1,2,1),(2,3,1). All edges sorted: [(0,1,1),(1,2,1),(2,3,1),(0,2,2),(0,3,2)]. MST: Add (0,1,1)→cost=1. Add (1,2,1)→cost=2. Add (2,3,1)→cost=3. All connected. Minimum cost = 3.

Common mistakes candidates make

  • Trying to solve with greedy well-placement without the virtual node insight.
  • Not including virtual edges for well construction in the MST.
  • Using Dijkstra instead of Kruskal's (Dijkstra finds shortest path, not MST).
  • Not applying Union-Find for cycle detection.

Interview preparation tip

The "virtual node" trick is a powerful MST transformation. When some nodes can optionally connect to an "external source" at a cost, model the source as a virtual node with edges of the respective costs. The MST then optimally chooses between direct connections and source connections. Practice Kruskal's algorithm with Union-Find until you can implement it fluently — it's the cleanest MST solution for edge-list inputs.

Similar Questions