When it comes to circuit breaking, the library does its job: it keeps your application alive even when a dependent service is down, and it also gives you metrics about that, so you can be alerted and follow the incident.
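As a minimal sketch of what that looks like in practice, assuming a JVM service and a Resilience4j-style breaker (the actual library in use may differ), a protected call and its breaker metrics might look roughly like this:

```java
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

import java.time.Duration;
import java.util.function.Supplier;

public class PaymentClient {

    private final CircuitBreaker circuitBreaker;

    public PaymentClient() {
        // Open the circuit when 50% of recent calls fail; probe again after 30s.
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)
                .waitDurationInOpenState(Duration.ofSeconds(30))
                .build();
        this.circuitBreaker = CircuitBreakerRegistry.of(config)
                .circuitBreaker("paymentService");
    }

    public String charge() {
        // Calls go through the breaker; when it is open, the fallback answers
        // immediately instead of hanging on a dead dependency.
        Supplier<String> decorated =
                CircuitBreaker.decorateSupplier(circuitBreaker, this::callPaymentApi);
        try {
            return decorated.get();
        } catch (Exception e) {
            return "payment-deferred"; // degraded, but the application stays alive
        }
    }

    public void logState() {
        // The breaker also exposes numbers you can ship to your dashboards and alerts.
        CircuitBreaker.Metrics metrics = circuitBreaker.getMetrics();
        System.out.printf("state=%s failureRate=%.1f%%%n",
                circuitBreaker.getState(), metrics.getFailureRate());
    }

    private String callPaymentApi() {
        // hypothetical remote call to the dependent service
        throw new RuntimeException("service down");
    }
}
```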
Thinking about metrics in real production issues, I believe they are closely tied to incident management culture. Depending on the incident, the answer could be right there in your metrics dashboard, or the metrics could point you down a specific path toward solving the problem.
Working on projects that expose business metrics as well as machine metrics (CPU/memory), I have been through several production incidents, and while some coworkers would dig through logs, others went straight to the metrics and, most of the time, found the solution there.
Working with dashboards and alerts also helps you prevent production incidents. You might set an alert on a threshold that tells you a feature is about to face a problem (setting alerts is an art of experimenting and tuning). In cases like that your team is warned before the issue really happens, and has time to act and mitigate it. And of course you can set alerts on thresholds that indicate your application is down, so you are alerted before users start complaining about it.
In addition to application metrics, another benefit is that you can expose business metrics. If someone on your team reviews and looks at them, they can draw insights from them, which can improve your product, architecture design...
The implementation overhead is totally worth it, in my opinion. That said, technology evolves and different frameworks are built that help ease this process. I haven't worked with Micrometer (https://github.com/micrometer-metrics/micrometer) yet, but I have heard good things about it. We should always look for a balance between effort and benefits!
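Since I haven't used Micrometer myself, take this as a rough sketch based on its documentation: registering a couple of business metrics (the metric names and the checkout scenario here are just illustrative) might look something like this:

```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class CheckoutMetrics {

    private final Counter ordersPlaced;
    private final Counter paymentsFailed;

    public CheckoutMetrics(MeterRegistry registry) {
        // Business metrics: what the product is doing, not just CPU/memory.
        this.ordersPlaced = Counter.builder("checkout.orders.placed")
                .description("Orders successfully placed")
                .register(registry);
        this.paymentsFailed = Counter.builder("checkout.payments.failed")
                .tag("reason", "provider_error")
                .description("Payments rejected by the provider")
                .register(registry);
    }

    public void recordOrder() { ordersPlaced.increment(); }

    public void recordPaymentFailure() { paymentsFailed.increment(); }

    public static void main(String[] args) {
        // SimpleMeterRegistry keeps metrics in memory; in production you would
        // plug in a backend-specific registry (Prometheus, Datadog, etc.).
        MeterRegistry registry = new SimpleMeterRegistry();
        CheckoutMetrics metrics = new CheckoutMetrics(registry);
        metrics.recordOrder();
        metrics.recordPaymentFailure();
        registry.getMeters().forEach(meter -> System.out.println(meter.getId()));
    }
}
```

Once metrics like these are exported, the same counters can feed both the business dashboards and the alert thresholds discussed above.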