Kriging or Gaussian process regression is a method of interpolation for which the interpolated values are modeled by a Gaussian process governed by prior covariances. In this example, we show how PARyOpt can be used to generate response surfaces using available data. As with the previous examples, we shall use the standard parabola as the underlying function to be approximated. There are several ways to use PARyOpt for Kriging, one of which is shown here. This is possibly the easiest and cleanest way to perform Kriging using PARyOpt.
Data generation: Since the underlying function is known, we shall generate data by invoking this function at some random locations within the bounds and storing them in an external file. This is achieved through the following snippet:
def create_data_csv(function: Callable, filename: str, l_bound: np.array, u_bound: np.array) -> None: # generate some random locations -- 7 normalized_population = np.random.ranf((7, )) real_population = l_bound + normalized_population * (u_bound - l_bound) real_population = [np.asarray([p]) for p in real_population] # evaluate the values real_functions = [float(function(p)) for p in real_population] # write into file with open(filename, 'w') as f: writer = csv.writer(f, delimiter=',') writer.writerow('x y') for x, y in zip(real_population, real_functions): writer.writerow(list(x) + [y])
Data assimilation : PARyOpt provides an
add_point() to add external data manually. The user has to supply
the x location and any available y values to this method to add data to the
An example usage of
add_point() using the above generated data can be:
def load_from_csv(b_opt: BayesOpt, filename: str) -> BayesOpt: """ load data from csv file and add to PARyOpt """ with open(filename, 'r') as csvfile: csv_file_lines = csv.reader(csvfile, delimiter=',') for row_num, row in enumerate(csv_file_lines): if row_num == 0: # skipping the header pass else: b_opt.add_point(x=np.asarray([float(row)]), y=float(row[-1]), if_check_nearness=True) b_opt.update_surrogate() return b_opt
Note that the user has to manually invoke
update_surrogate(). This is currently for efficiency purposes and hope
to be replaced in the upcoming versions.
Finally, since the user wants to add data manually and does not want the standard initialization required for bayesian
optimization, we provide a switch
do_init to turn off the initialization. Since there is no cost function to
be optimized, the evaluator should be passed in an empty function for evaluation.
# dummy evaluators: evaluator = FunctionEvaluator(lambda x: 0.0) krig = BayesOpt(cost_function=evaluator, l_bound=l_bound, u_bound=u_bound, n_dim=1, n_init=0, do_init=False, # ensures that initialization is not done. kern_function='sqr_exp', acq_func='LCB', kappa_strategy=lambda curr_iter: 1000, if_restart=False) krig = load_from_csv(krig, data_filename) krig.estimate_best_kernel_parameters(theta_bounds=[[0.001, 10.0]])
Note that since we are not providing any actual cost function here,
update_iter() does nothing useful. In case
the user is looking for an instantaneous Kriging model, i.e., creating a Kriging surface and updating it, the actual
cost function should be provided. Just like the previous examples, one may use
evaluator from example-3_
and do Kriging similar to optimization.
Now that the surrogate is created and hyper-parameters optimized, one can start querying it using
location = np.array([1.0]) mean, variance = krig.evaluate_surrogate_at(location)