ForceBalance API  1.3
Automated optimization of force fields and empirical potentials
target.py
1 """ Target base class from which all ForceBalance fitting targets are derived. """
2 from __future__ import print_function
3 
4 from builtins import str
5 from builtins import range
6 import abc
7 import os
8 import subprocess
9 import shutil
10 import numpy as np
11 import time
12 from collections import OrderedDict
13 import tarfile
14 import forcebalance
15 from forcebalance.nifty import row, col, printcool_dictionary, link_dir_contents, createWorkQueue, getWorkQueue, wq_wait1, getWQIds, wopen, warn_press_key, _exec, lp_load, LinkFile
16 from forcebalance.finite_difference import fdwrap_G, fdwrap_H, f1d2p, f12d3p, in_fd
17 from forcebalance.optimizer import Counter
18 from forcebalance.output import getLogger
19 from future.utils import with_metaclass
20 logger = getLogger(__name__)
21 
22 class Target(with_metaclass(abc.ABCMeta, forcebalance.BaseClass)):
23 
24  """
25  Base class for all fitting targets.
26 
27  In ForceBalance a Target is defined as a set of reference data
28  plus a corresponding method to simulate that data using the force field.
29 
30  The 'computable quantities' may include energies and forces where the
31  reference values come from QM calculations (energy and force matching),
32  energies from an EDA analysis (Maybe in the future, FDA?), molecular
33  properties (like polarizability, refractive indices, multipole moments
34  or vibrational frequencies), relative entropies, and bulk properties.
35  Single-molecule or bulk properties can even come from experiment!
36 
37  The central idea in ForceBalance is that each quantity makes a
38  contribution to the overall objective function. So we can build force
39  fields that fit several quantities at once, rather than putting all of
40  our chips behind energy and force matching. In the future
41  ForceBalance may even include multiobjective optimization into the
42  optimizer.
43 
44  The optimization is done by way of minimizing an 'objective
45  function', which is composed of squared differences between the
46  computed and reference values. These differences are not computed
47  in this file, but rather in subclasses that use Target
48  as a base class. Thus, the contents of Target itself
49  are meant to be as general as possible, because the pertinent
50  variables apply to all types of fitting targets.
51 
52  An important note: Target requires that all subclasses
53  have a method get(self,mvals,AGrad=False,AHess=False)
54  that does the following:
55 
56  Inputs:
57  mvals = The parameter vector, which modifies the force field
58  (Note to self: We include mvals with each Target because we can create
59  copies of the force field and do finite difference derivatives)
60  AGrad, AHess = Boolean switches for computing analytic gradients and Hessians
61 
62  Outputs:
63  Answer = {'X': Number, 'G': array(NP), 'H': array((NP,NP)) }
64  'X' = The objective function itself
65  'G' = The gradient, elements not computed analytically are zero
66  'H' = The Hessian, elements not computed analytically are zero
67 
68  This is the only global requirement of a Target.
69  Obviously 'get' itself is not defined here, because its
70  calculation will depend entirely on specifically which target
71  we wish to use. However, this should give us a unified framework
72  which will facilitate rapid implementation of Targets.
73 
74  Future work:
75  Robert suggested that I could enable automatic detection of which
76  parameters need to be computed by finite difference. Not a bad idea. :)
77 
78  """
79 
80  def __init__(self,options,tgt_opts,forcefield):
81  """
82  All options here are intended to be usable by every
83  conceivable type of target (in other words, only
84  add content here if it's widely applicable.)
85 
86  If we want to add attributes that are more specific
87  (e.g. a set of reference forces for force matching), they
88  are added in the subclass AbInitio that inherits from
89  Target.
90 
91  """
92  super(Target, self).__init__(options)
93  #======================================#
94  # Options that are given by the parser #
95  #======================================#
96 
97  self.set_option(options, 'root')
98 
99  self.set_option(tgt_opts, 'name')
100  if self.name in ["forcefield-remote"]:
101  logger.error("forcefield-remote is not an allowed target name (reserved)")
102  raise RuntimeError
103 
104  self.set_option(tgt_opts, 'type')
105 
106  self.set_option(tgt_opts, 'weight')
107 
108  self.set_option(tgt_opts, 'fdgrad')
109 
110  self.set_option(tgt_opts, 'fdhess')
111 
112  self.set_option(tgt_opts, 'fdhessdiag')
113 
114  self.set_option(tgt_opts, 'sleepy')
115 
116  self.set_option(None, None, 'fd1_pids', [i.upper() for i in tgt_opts['fd_ptypes']], default = [])
117  self.set_option(None, None, 'fd2_pids', [i.upper() for i in tgt_opts['fd_ptypes']], default = [])
118 
120  self.set_option(options, 'finite_difference_h', 'h')
121 
122  self.set_option(options, 'backup')
123 
124  self.set_option(tgt_opts, 'read', 'rd')
125  if self.rd is not None: self.rd = self.rd.strip("/")
126 
127  self.set_option(options, 'zerograd')
128 
129  self.set_option(tgt_opts, 'epsgrad')
130 
131  self.pgrad = list(range(forcefield.np))
132  self.OptionDict['pgrad'] = self.pgrad
133 
134  #======================================#
135  # Variables which are set here #
136  #======================================#
137 
138  if os.path.exists('targets'):
139  tgtdir = 'targets'
140  elif os.path.exists('simulations'):
141  tgtdir = 'simulations'
142  elif os.path.exists('targets.tar.bz2'):
143  logger.info("Extracting targets folder from archive.\n")
144  _exec("tar xvjf targets.tar.bz2")
145  tgtdir = 'targets'
146  elif os.path.exists('targets.tar.gz'):
147  logger.info("Extracting targets folder from archive.\n")
148  _exec("tar xvzf targets.tar.gz")
149  tgtdir = 'targets'
150  else:
151  logger.error('\x1b[91mThe targets directory is missing!\x1b[0m\nDid you finish setting up the target data?\nPlace the data in a directory called "targets" or "simulations"\n')
152  raise RuntimeError
153  self.set_option(None, None, 'tgtdir', os.path.join(tgtdir,self.name))
154 
156  if 'input_file' in options and options['input_file'] is not None:
157  self.tempbase = os.path.splitext(options['input_file'])[0]+'.tmp'
158  else:
159  self.tempbase = "temp"
160  self.tempdir = os.path.join(self.tempbase, self.name)
161 
163  self.rundir = self.tempdir
164 
165  self.FF = forcefield
166 
168  if hasattr(self, 'mol2'):
169  for fnm in self.FF.fnms:
170  if fnm.endswith('.mol2'):
171  self.mol2.append(fnm)
172 
173 
174  self.xct = 0
175 
176  self.gct = 0
177 
178  self.hct = 0
179 
180  self.read_indicate = True
181 
182  self.write_indicate = True
183 
184  self.read_objective = True
185 
186  self.write_objective = True
187 
188  if not options['continue']:
189  self.refresh_temp_directory()
190  else:
191  if not os.path.exists(os.path.join(self.root,self.tempdir)):
192  os.makedirs(os.path.join(self.root,self.tempdir))
193 
194  self.evaluated = False
195 
196  self.goodstep = False
198  def get_X(self,mvals=None,customdir=None):
199  """Computes the objective function contribution without any parametric derivatives"""
200  Ans = self.meta_get(mvals,0,0,customdir=customdir)
201  self.xct += 1
202  if Ans['X'] != Ans['X']:  # True only when X is NaN (NaN != NaN)
203  return {'X':1e10, 'G':np.zeros(self.FF.np), 'H':np.zeros((self.FF.np,self.FF.np))}
204  return Ans
205 
206  def read_0grads(self):
207 
208  """ Read a file from the target directory containing names of
209  parameters that don't contribute to the gradient.
210 
211  *Note* that we are checking the derivatives of the objective
212  function, and not the derivatives of the quantities that go
213  into building the objective function. However, it is the
214  quantities that we actually differentiate. Since there is a
215  simple chain rule relationship, the parameters that do/don't
216  contribute to the objective function/quantities are the same.
217 
218  However, property gradients do contribute to objective
219  function Hessian elements, so we cannot use the same mechanism
220  for excluding the calculation of property Hessians. This is
221  mostly fine since we rarely if ever calculate an explicit
222  property Hessian. """
223 
224  zero_prm = os.path.join(self.root, self.tgtdir, 'zerograd.txt')
225  # If the 'zero parameters' text file exists, then we load
226  # the parameter names from the file for exclusion.
227  pgrad0 = self.pgrad[:]
228  self.pgrad = list(range(self.FF.np))
229  if os.path.exists(zero_prm):
230  for ln, line in enumerate(open(zero_prm).readlines()):
231  pid = line.strip()
232  # If a parameter name exists in the map, then
233  # the derivative is switched off for this target.
234  if pid in self.FF.map and self.FF.map[pid] in self.pgrad:
235  self.pgrad.remove(self.FF.map[pid])
236  for i in pgrad0:
237  if i not in self.pgrad:
238  pass
239  # logger.info("Parameter %s was deactivated in %s\n" % (i, self.name))
240  for i in self.pgrad:
241  if i not in pgrad0:
242  logger.info("Parameter %s was reactivated in %s\n" % (i, self.name))
243  # Set pgrad in the OptionDict so remote targets may use it.
244  self.OptionDict['pgrad'] = self.pgrad
245 
246  def write_0grads(self, Ans):
247 
248  """ Write a file to the target directory containing names of
249  parameters that don't contribute to the gradient. """
250 
251  zero_prm = os.path.join(self.root, self.tgtdir, 'zerograd.txt')
252  if os.path.exists(zero_prm):
253  zero_pids = [i.strip() for i in open(zero_prm).readlines()]
254  else:
255  zero_pids = []
256  for i in range(self.FF.np):
257  # Check whether this parameter number has a nonzero gradient.
258  if abs(Ans['G'][i]) <= self.epsgrad:
259  # Write parameter names corresponding to this parameter number.
260  for pid in self.FF.map:
261  if self.FF.map[pid] == i and pid not in zero_pids:
262  logger.info("Adding %s to zero_pids in %s\n" % (i, self.name))
263  zero_pids.append(pid)
264  # If a parameter number has a nonzero gradient, then the parameter
265  # names associated with this parameter number are removed from the list.
266  # (Not sure if this will ever happen.)
267  if abs(Ans['G'][i]) > self.epsgrad:
268  for pid in self.FF.map:
269  if self.FF.map[pid] == i and pid in zero_pids:
270  logger.info("Removing %s from zero_pids in %s\n" % (i, self.name))
271  zero_pids.remove(pid)
272  if len(zero_pids) > 0:
273  fout = open(zero_prm, 'w')
274  for pid in zero_pids:
275  print(pid, file=fout)
276  fout.close()
277 
278  def get_G(self,mvals=None,customdir=None):
279  """Computes the objective function contribution and its gradient.
280 
281  First the low-level 'get' method is called with the analytic gradient
282  switch turned on. Then we loop through the fd1_pids and compute
283  the corresponding elements of the gradient by finite difference,
284  if the 'fdgrad' switch is turned on. Alternately we can compute
285  the gradient elements and diagonal Hessian elements at the same time
286  using central difference if 'fdhessdiag' is turned on.
287 
288  In this function we also record which parameters cause a
289  nonzero change in the objective function contribution.
290  Parameters which do not change the objective function will
291  not be differentiated in subsequent calculations. This is
292  recorded in a text file in the targets directory.
293 
294  """
295  Ans = self.meta_get(mvals,1,0,customdir=customdir)
296  for i in self.pgrad:
297  if any([j in self.FF.plist[i] for j in self.fd1_pids]) or 'ALL' in self.fd1_pids:
298  if self.fdhessdiag:
299  Ans['G'][i], Ans['H'][i,i] = f12d3p(fdwrap_G(self,mvals,i),self.h,f0 = Ans['X'])
300  elif self.fdgrad:
301  Ans['G'][i] = f1d2p(fdwrap_G(self,mvals,i),self.h,f0 = Ans['X'])
302  self.gct += 1
303  if Counter() == self.zerograd and self.zerograd >= 0:
304  self.write_0grads(Ans)
305  return Ans
306 
307  def get_H(self,mvals=None,customdir=None):
308  """Computes the objective function contribution and its gradient / Hessian.
309 
310  First the low-level 'get' method is called with the analytic gradient
311  and Hessian both turned on. Then we loop through the fd1_pids and compute
312  the corresponding elements of the gradient by finite difference,
313  if the 'fdhess' switch is turned on.
314 
315  This is followed by looping through the fd2_pids and computing the corresponding
316  Hessian elements by finite difference. Forward finite difference is used
317  throughout for the sake of speed.
318  """
319  Ans = self.meta_get(mvals,1,1,customdir=customdir)
320  if self.fdhess:
321  for i in self.pgrad:
322  if any([j in self.FF.plist[i] for j in self.fd1_pids]) or 'ALL' in self.fd1_pids:
323  Ans['G'][i] = f1d2p(fdwrap_G(self,mvals,i),self.h,f0 = Ans['X'])
324  for i in self.pgrad:
325  if any([j in self.FF.plist[i] for j in self.fd2_pids]) or 'ALL' in self.fd2_pids:
326  FDSlice = f1d2p(fdwrap_H(self,mvals,i),self.h,f0 = Ans['G'])
327  Ans['H'][i,:] = FDSlice
328  Ans['H'][:,i] = FDSlice
329  elif self.fdhessdiag:
330  for i in self.pgrad:
331  if any([j in self.FF.plist[i] for j in self.fd2_pids]) or 'ALL' in self.fd2_pids:
332  Ans['G'][i], Ans['H'][i,i] = f12d3p(fdwrap_G(self,mvals,i),self.h, f0 = Ans['X'])
333  if Counter() == self.zerograd and self.zerograd >= 0:
334  self.write_0grads(Ans)
335  self.hct += 1
336  return Ans
337 
338  def link_from_tempdir(self,absdestdir):
339  link_dir_contents(os.path.join(self.root,self.tempdir), absdestdir)
340 
341  def refresh_temp_directory(self):
342  """ Back up the temporary directory if desired, delete it
343  and then create a new one."""
344  cwd = os.getcwd()
345  abstempdir = os.path.join(self.root,self.tempdir)
346  if self.backup:
347  bakdir = os.path.join(os.path.splitext(self.tempbase)[0]+'.bak')
348  if not os.path.exists(bakdir):
349  os.makedirs(bakdir)
350  if os.path.exists(abstempdir):
351  os.chdir(self.tempbase)
352  FileCount = 0
353  while True:
354  CandFile = os.path.join(self.root,bakdir,"%s_%i.tar.bz2" % (self.name,FileCount))
355  if os.path.exists(CandFile):
356  FileCount += 1
357  else:
358  # I could use the tarfile module here
359  logger.info("Backing up: " + self.tempdir + ' to: ' + "%s/%s_%i.tar.bz2\n" % (bakdir,self.name,FileCount))
360  subprocess.call(["tar","cjf",CandFile,self.name])
361  shutil.rmtree(self.name)
362  break
363  os.chdir(cwd)
364  # Delete the temporary directory
365  shutil.rmtree(abstempdir,ignore_errors=True)
366  # Create a new temporary directory from scratch
367  os.makedirs(abstempdir)
368  # QYD: Potential bug:
369  # this function may be skipped when continue=True, causing mol2 file missing in temp folder
370  if hasattr(self, 'mol2'):
371  for f in self.mol2:
372  if os.path.exists(os.path.join(self.root, self.tgtdir, f)):
373  LinkFile(os.path.join(self.root, self.tgtdir, f), os.path.join(abstempdir, f))
374  elif f not in self.FF.fnms:
375  logger.error("%s doesn't exist and it's not in the force field directory either" % f)
376  raise RuntimeError
377 
378 
379  @abc.abstractmethod
380  def get(self,mvals,AGrad=False,AHess=False):
381 
382  """
383 
384  Every target must be able to return a contribution
385  to the objective function - however, this must be implemented
386  in the specific subclass. See abinitio for an
387  example.
388 
389  """
390 
391  logger.error('The get method is not implemented in the Target base class\n')
392  raise NotImplementedError
393 
394  def check_files(self, there):
395 
396  """ Check this directory for the presence of readable files when the 'read' option is set. """
397 
398  there = os.path.abspath(there)
399  if all([any([i == j for j in os.listdir(there)]) for i in ["objective.p", "indicate.log"]]):
400  return True
401  return False
402 
403  def read(self,mvals,AGrad=False,AHess=False):
404 
405  """
407  Read data from disk for the initial optimization step if the
408  user has provided the directory to the "read" option.
409 
410  """
411  mvals1 = np.loadtxt('mvals.txt')
412 
413  if len(mvals) > 0 and (np.max(np.abs(mvals1 - mvals)) > 1e-3):
414  warn_press_key("mvals from mvals.txt does not match up with get! (Are you reading data from a previous run?)\nmvals(call)=%s mvals(disk)=%s" % (mvals, mvals1))
415 
416  return lp_load('objective.p')
417 
418  def absrd(self, inum=None):
419 
420  """
421  Supply the correct directory specified by user's "read" option.
422  """
423 
424  if self.evaluated:
425  logger.error("Tried to read from disk, but not allowed because this target is evaluated already\n")
426  raise RuntimeError
427  if self.rd is None:
428  logger.error("The directory for reading is not set\n")
429  raise RuntimeError
430 
431  # Current directory. Move back into here after reading data.
432  here = os.getcwd()
433  # Absolute path for the directory to read data from.
434  if os.path.isabs(self.rd):
435  abs_rd = self.rd
436  else:
437  abs_rd = os.path.join(self.root, self.rd)
438  # Check for directory existence.
439  if not os.path.exists(abs_rd):
440  logger.error("Provided path %s does not exist\n" % self.rd)
441  raise RuntimeError
442  # Figure out which directory to go into.
443  s = os.path.split(self.rd)
444  have_data = 0
445  if s[-1].startswith('iter_'):
446  # Case 1: User has provided a specific directory to read from.
447  there = abs_rd
448  if not self.check_files(there):
449  logger.error("Provided path %s does not contain remote target output\n" % self.rd)
450  raise RuntimeError
451  have_data = 1
452  elif s[-1] == self.name:
453  # Case 2: User has provided the target name.
454  iterints = [int(d.replace('iter_','')) for d in os.listdir(abs_rd) if os.path.isdir(os.path.join(abs_rd, d))]
455  for i in sorted(iterints)[::-1]:
456  there = os.path.join(abs_rd, 'iter_%04i' % i)
457  if self.check_files(there):
458  have_data = 1
459  break
460  else:
461  # Case 3: User has provided something else (must contain the target name in the next directory down.)
462  if not os.path.exists(os.path.join(abs_rd, self.name)):
463  logger.error("Target directory %s does not exist in %s\n" % (self.name, self.rd))
464  raise RuntimeError
465  iterints = [int(d.replace('iter_','')) for d in os.listdir(os.path.join(abs_rd, self.name)) if os.path.isdir(os.path.join(abs_rd, self.name, d))]
466  for i in sorted(iterints)[::-1]:
467  there = os.path.join(abs_rd, self.name, 'iter_%04i' % i)
468  if self.check_files(there):
469  have_data = 1
470  break
471  if not have_data:
472  logger.error("Did not find data to read in %s\n" % self.rd)
473  raise RuntimeError
474 
475  if inum is not None:
476  there = os.path.join(os.path.split(there)[0],'iter_%04i' % inum)
477  return there
478 
479  def maxrd(self):
480 
481  """ Supply the latest existing temp-directory containing valid data. """
482 
483  abs_rd = os.path.join(self.root, self.tempdir)
484 
485  iterints = [int(d.replace('iter_','')) for d in os.listdir(abs_rd) if os.path.isdir(os.path.join(abs_rd, d))]
486  for i in sorted(iterints)[::-1]:
487  there = os.path.join(abs_rd, 'iter_%04i' % i)
488  if self.check_files(there):
489  return i
490 
491  return -1
492 
493  def maxid(self):
495  """ Supply the latest existing temp-directory. """
496 
497  abs_rd = os.path.join(self.root, self.tempdir)
498 
499  iterints = [int(d.replace('iter_','')) for d in os.listdir(abs_rd) if os.path.isdir(os.path.join(abs_rd, d))]
500  return sorted(iterints)[-1]
501 
502  def meta_indicate(self, customdir=None):
503 
504  """
505 
506  Wrap around the indicate function, so it can print to screen and
507  also to a file. If reading from checkpoint file, don't call
508  the indicate() function, instead just print the file contents
509  to the screen.
510 
511  """
512  # Using the module level logger
513  logger = getLogger(__name__)
514  # Note that reading information is not supported for custom folders (e.g. microiterations during search)
515  if self.rd is not None and (not self.evaluated) and self.read_indicate and customdir is None:
516  # Move into the directory for reading data,
517  cwd = os.getcwd()
518  os.chdir(self.absrd())
519  logger.info(open('indicate.log').read())
520  os.chdir(cwd)
521  else:
522  if self.write_indicate:
523  # Go into the directory where the job is running
524  cwd = os.getcwd()
525  os.chdir(os.path.join(self.root, self.rundir))
526  # If indicate.log already exists then we've made some kind of mistake.
527  if os.path.exists('indicate.log'):
528  logger.error('indicate.log should not exist yet in this directory: %s\n' % os.getcwd())
529  raise RuntimeError
530  # Add a handler for printing to screen and file
531  logger = getLogger("forcebalance")
532  hdlr = forcebalance.output.RawFileHandler('indicate.log')
533  logger.addHandler(hdlr)
534  # Execute the indicate function
535  self.indicate()
536  if self.write_indicate:
537  # Remove the handler (return to normal printout)
538  logger.removeHandler(hdlr)
539  # Return to the module level logger
540  logger = getLogger(__name__)
541  # The module level logger now prints the indicator
542  logger.info(open('indicate.log').read())
543  # Go back to the directory where we were
544  os.chdir(cwd)
545 
546  def meta_get(self, mvals, AGrad=False, AHess=False, customdir=None):
547  """
548  Wrapper around the get function.
549  Creates the directory for the target, and then calls 'get'.
550  If we are reading existing data, goes into the appropriate read directory and calls read() instead.
551  The 'get' method should not worry about the directory that it's running in.
552 
553  """
554 
557  cwd = os.getcwd()
558 
559  absgetdir = os.path.join(self.root,self.tempdir)
560  if Counter() is not None:
561  # Not expecting more than ten thousand iterations
562  if Counter() >= 10000:
563  logger.error('Cannot handle 10000 or more iterations due to current directory structure (iter_%04i). Consider revising code.\n')
564  raise RuntimeError
565  iterdir = "iter_%04i" % Counter()
566  absgetdir = os.path.join(absgetdir,iterdir)
567  if customdir is not None:
568  absgetdir = os.path.join(absgetdir,customdir)
569 
570  if not os.path.exists(absgetdir):
571  os.makedirs(absgetdir)
572  os.chdir(absgetdir)
573  self.link_from_tempdir(absgetdir)
574  self.rundir = absgetdir.replace(self.root+'/','')
575 
577  if self.rd is not None and (not self.evaluated) and self.read_objective and customdir is None:
578  os.chdir(self.absrd())
579  logger.info("Reading objective function information from %s\n" % os.getcwd())
580  Answer = self.read(mvals, AGrad, AHess)
581  os.chdir(absgetdir)
582  else:
583 
584  Answer = self.get(mvals, AGrad, AHess)
585  if self.write_objective:
586  forcebalance.nifty.lp_dump(Answer, 'objective.p')
587 
588 
591  if not in_fd():
592  self.FF.make(mvals)
593 
594  os.chdir(cwd)
595 
596  return Answer
597 
598  def submit_jobs(self, mvals, AGrad=False, AHess=False):
599  return
600 
601  def stage(self, mvals, AGrad=False, AHess=False, customdir=None, firstIteration=False):
602  """
603 
604  Stages the directory for the target, and then launches Work Queue processes if any.
605  The 'get' method should not worry about the directory that it's running in.
606 
607  """
608  if self.sleepy > 0:
609  logger.info("Sleeping for %i seconds as directed...\n" % self.sleepy)
610  time.sleep(self.sleepy)
611 
614  cwd = os.getcwd()
616  absgetdir = os.path.join(self.root,self.tempdir)
617  if Counter() is not None:
618 
619  iterdir = "iter_%04i" % Counter()
620  absgetdir = os.path.join(absgetdir,iterdir)
621  if customdir is not None:
622  absgetdir = os.path.join(absgetdir,customdir)
623 
624  if not os.path.exists(absgetdir):
625  os.makedirs(absgetdir)
626  os.chdir(absgetdir)
627  self.link_from_tempdir(absgetdir)
628 
629  if not in_fd():
630  np.savetxt('mvals.txt', mvals)
631 
632  if Counter() >= self.zerograd and self.zerograd >= 0:
633  self.read_0grads()
634  self.rundir = absgetdir.replace(self.root+'/','')
635 
636  if self.rd is None or (not firstIteration):
637  self.submit_jobs(mvals, AGrad, AHess)
638  elif customdir is not None:
639  # Allows us to submit micro-iteration jobs for remote targets
640  self.submit_jobs(mvals, AGrad, AHess)
641  os.chdir(cwd)
642 
643  return
644 
645  def wq_complete(self):
646  """ This method determines whether the Work Queue tasks for the current target have completed. """
647  wq = getWorkQueue()
648  WQIds = getWQIds()
649  if wq is None:
650  return True
651  elif wq.empty():
652  WQIds[self.name] = []
653  return True
654  elif len(WQIds[self.name]) == 0:
655  return True
656  else:
657  wq_wait1(wq, wait_time=30)
658  if len(WQIds[self.name]) == 0:
659  return True
660  else:
661  return False
662 
663  def printcool_table(self, data=OrderedDict([]), headings=[], banner=None, footnote=None, color=0):
664  """ Print target information in an organized table format. Implemented 6/30 because
665  multiple targets are already printing out tabulated information in very similar ways.
666  This method is a simple wrapper around printcool_dictionary.
667 
668  The input should be something like:
669 
670  @param data Column contents in the form of an OrderedDict, with string keys and list vals.
671  The key is printed in the leftmost column and the vals are printed in the other columns.
672  If non-strings are passed, they will be converted to strings (not recommended).
673 
674  @param headings Column headings in the form of a list. Its length must equal the length
675  of each of the "vals" lists in the OrderedDict, plus one. Use "\n" characters to specify long
676  column names that may take up more than one line.
677 
678  @param banner Optional heading line, which will be printed at the top in the title.
679  @param footnote Optional footnote line, which will be printed at the bottom.
680 
681  """
682  tline="Target: %s Type: %s Objective = %.5e" % (self.name, self.__class__.__name__, self.objective)
683  nc = len(headings)
684  if banner is not None:
685  tlines = [banner, tline]
686  else:
687  tlines = [tline]
688  # Sanity check.
689  for val in data.values():
690  if (len(val)+1) != nc:
691  logger.error('There are %i column headings, so the values in the data dictionary must be lists of length %i (currently %i)\n' % (nc, nc-1, len(val)))
692  raise RuntimeError
693  cwidths = [0 for i in range(nc)]
694  # Figure out maximum column width.
695  # First look at all of the column headings...
696  crows = []
697  for cnum, cname in enumerate(headings):
698  crows.append(len(cname.split('\n')))
699  for l in cname.split('\n'):
700  cwidths[cnum] = max(cwidths[cnum], len(l))
701  # Then look at the row names to stretch out the first column width...
702  for k in data.keys():
703  cwidths[0] = max(cwidths[0], len(str(k)))
704  # Then look at the data values to stretch out the other column widths.
705  for v in data.values():
706  for n, f in enumerate(v):
707  cwidths[n+1] = max(cwidths[n+1], len(str(f)))
708  for i in range(1, len(cwidths)):
709  cwidths[i] += 2
710  if cwidths[0] < 15:
711  cwidths[0] = 15
712  cblocks = [['' for i in range(max(crows) - len(cname.split('\n')))] + cname.split('\n') for cnum, cname in enumerate(headings)]
713  # The formatting line consisting of variable column widths
714  fline = ' '.join("%%%s%is" % (("-" if i==0 else ""), j) for i, j in enumerate(cwidths))
715  vline = ' '.join(["%%%is" % j for i, j in enumerate(cwidths) if i > 0])
716  clines = [fline % (tuple(cblocks[j][i] for j in range(nc))) for i in range(max(crows))]
717  tlines += clines
718  PrintDict = OrderedDict([(key, vline % (tuple(val))) for key, val in data.items()])
719  if len(clines[0]) > len(tlines[0]):
720  centers = [0, 1]
721  else:
722  centers = [0]
723  printcool_dictionary(PrintDict, title='\n'.join(tlines), keywidth=cwidths[0], center=[i in centers for i in range(len(tlines))], leftpad=4, color=color)
724 
725  def serialize_ff(self, mvals, outside=None):
726  """
727  This code writes a force field pickle file to a folder in
728  "job.tmp/dnm/forcefield.p", because it takes
729  time to compress and most targets can simply reuse this file.
730 
731  Inputs:
732  mvals = Mathematical parameter values
733  outside = Write this file outside the targets directory
734  """
735  cwd = os.getcwd()
736  if outside is not None:
737  self.ffpd = cwd.replace(os.path.join(self.root, self.tempdir), os.path.join(self.root, self.tempbase, outside))
738  else:
739  self.ffpd = os.path.abspath(os.path.join(self.root, self.rundir))
740  if not os.path.exists(self.ffpd): os.makedirs(self.ffpd)
741  os.chdir(self.ffpd)
742  makeffp = False
743  if (os.path.exists("mvals.txt") and os.path.exists("forcefield.p")):
744  mvalsf = np.loadtxt("mvals.txt")
745  if len(mvalsf) > 0 and np.max(np.abs(mvals - mvalsf)) != 0.0:
746  makeffp = True
747  else:
748  makeffp = True
749  if makeffp:
750  # logger.info("Writing force field to: %s\n" % self.ffpd)
751  self.FF.make(mvals)
752  np.savetxt("mvals.txt", mvals)
753  forcebalance.nifty.lp_dump((self.FF, mvals), 'forcefield.p')
754  os.chdir(cwd)
755  forcebalance.nifty.LinkFile(os.path.join(self.ffpd, 'forcefield.p'), 'forcefield.p')
756 
757 class RemoteTarget(Target):
758  def __init__(self,options,tgt_opts,forcefield):
759  super(RemoteTarget, self).__init__(options,tgt_opts,forcefield)
760 
761  self.r_options = options.copy()
762  self.r_options["type"]="single"
763  self.set_option(tgt_opts, "remote_prefix", "rpfx")
764  self.set_option(tgt_opts, "remote_backup", "rbak")
765 
766  self.r_tgt_opts = tgt_opts.copy()
767  self.r_tgt_opts["remote"]=False
768 
769  tar = tarfile.open(name="%s/target.tar.bz2" % (self.tempdir), mode='w:bz2', dereference=True)
770  tar.add("%s/targets/%s" % (self.root, self.name), arcname = "targets/%s" % self.name)
771  tar.close()
772 
773  self.remote_indicate = ""
774 
775  if options['wq_port'] == 0:
776  logger.error("Please set the Work Queue port to use Remote Targets.\n")
777  raise RuntimeError
779  # Remote target will read objective.p and indicate.log at the same time,
780  # and it uses a different mechanism because it does this at every iteration (not just the 0th).
781  self.read_indicate = False
782  self.write_indicate = False
783  self.write_objective = False
784 
785  def submit_jobs(self, mvals, AGrad=False, AHess=False):
786 
787  id_string = "%s_iter%04i" % (self.name, Counter())
788 
789  self.serialize_ff(mvals, outside="forcefield-remote")
790  forcebalance.nifty.lp_dump((AGrad, AHess, id_string, self.r_options, self.r_tgt_opts, self.pgrad),'options.p')
791 
792  # Link in the rpfx script.
793  if len(self.rpfx) > 0:
794  forcebalance.nifty.LinkFile(os.path.join(os.path.split(__file__)[0],"data",self.rpfx),self.rpfx)
795  forcebalance.nifty.LinkFile(os.path.join(os.path.split(__file__)[0],"data","rtarget.py"),"rtarget.py")
796  forcebalance.nifty.LinkFile(os.path.join(self.root, self.tempdir, "target.tar.bz2"),"target.tar.bz2")
797 
798  wq = getWorkQueue()
799 
800  # logger.info("Sending target '%s' to work queue for remote evaluation\n" % self.name)
801  # input:
802  # forcefield.p: pickled force field and mvals
803  # options.p: pickled AGrad/AHess switches, job id string, options and pgrad
804  # rtarget.py: remote target evaluation script
805  # target.tar.bz2: tarred target
806  # output:
807  # target_result.tar.bz2: archive holding objective.p (pickled objective function dictionary)
808  # and indicate.log (results of target.indicate() written to file)
809  # if len(self.rpfx) > 0 and self.rpfx not in ['rungmx.sh', 'runcuda.sh']:
810  # logger.error('Unsupported prefix script for launching remote target')
811  # raise RuntimeError
812  forcebalance.nifty.queue_up(wq, "%spython rtarget.py > rtarget.out 2>&1" % (("sh %s%s " % (self.rpfx, " -b" if self.rbak else ""))
813  if len(self.rpfx) > 0 else ""),
814  ["forcefield.p", "options.p", "rtarget.py", "target.tar.bz2"] + ([self.rpfx] if len(self.rpfx) > 0 else []),
815  ['target_result.tar.bz2'],
816  tgt=self, tag=self.name, verbose=False)
817 
818  def read(self,mvals,AGrad=False,AHess=False):
819  return self.get(mvals, AGrad, AHess)
820 
821  def get(self,mvals,AGrad=False,AHess=False):
822  with tarfile.open("target_result.tar.bz2", "r") as tar:
823  tar.extractall()
824  with open('indicate.log', 'r') as f:
825  self.remote_indicate = f.read()
826  return lp_load('objective.p')
827 
828  def indicate(self):
829  logger.info(self.remote_indicate)
830 
831 
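The following standalone sketch illustrates two details of the remote-target round trip in the listing above. It is not part of ForceBalance. `build_remote_command` mirrors the command-string expression passed to `queue_up`, while `pack_result` and `unpack_result` are hypothetical helpers that imitate how `target_result.tar.bz2` could be produced on the worker and unpacked by `get`. As a simplifying assumption, `objective.p` is treated as opaque bytes here; the real code loads it with `lp_load`.

```python
import os
import tarfile
import tempfile


def build_remote_command(rpfx="", rbak=False):
    """Mirror the command-string expression given to queue_up in the listing."""
    # The optional prefix script is run through "sh", with "-b" appended when rbak is set.
    prefix = ("sh %s%s " % (rpfx, " -b" if rbak else "")) if len(rpfx) > 0 else ""
    return "%spython rtarget.py > rtarget.out 2>&1" % prefix


def pack_result(workdir, indicate_text, objective_bytes):
    """Hypothetical worker side: bundle the two output files into one archive."""
    contents = (("indicate.log", indicate_text.encode()),
                ("objective.p", objective_bytes))
    for name, data in contents:
        with open(os.path.join(workdir, name), "wb") as f:
            f.write(data)
    archive = os.path.join(workdir, "target_result.tar.bz2")
    with tarfile.open(archive, "w:bz2") as tar:
        for name, _ in contents:
            # arcname keeps the archive members flat, so they extract by bare file name.
            tar.add(os.path.join(workdir, name), arcname=name)
    return archive


def unpack_result(archive, destdir):
    """Master side, analogous to get(): extract, then read both files."""
    # Mode "r" lets tarfile detect the bz2 compression on its own.
    with tarfile.open(archive, "r") as tar:
        tar.extractall(destdir)
    with open(os.path.join(destdir, "indicate.log"), "r") as f:
        indicate = f.read()
    with open(os.path.join(destdir, "objective.p"), "rb") as f:
        payload = f.read()
    return indicate, payload


if __name__ == "__main__":
    print(build_remote_command())
    print(build_remote_command("rungmx.sh", rbak=True))
    src, dst = tempfile.mkdtemp(), tempfile.mkdtemp()
    print(unpack_result(pack_result(src, "remote summary\n", b"payload"), dst))
```

Bundling both outputs into a single archive means the Work Queue task has only one output file to transfer back, which is consistent with the `['target_result.tar.bz2']` output list in `submit_jobs`.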