livinginthepast / smf

A Lightweight Resource Provider for SMF in Chef

Default dependency to svc:/milestone/sysconfig incorrect on Solaris 11

cruschke opened this issue

The documentation says that these are the default dependencies:

  • svc:/milestone/sysconfig
  • svc:/system/filesystem/local
  • svc:/milestone/name-services
  • svc:/milestone/network

but svc:/milestone/sysconfig does not exist anymore on Solaris 11.

Solaris 10

mysolaris10 :~> svcs svc:/milestone/sysconfig svc:/system/filesystem/local svc:/milestone/name-services svc:/milestone/network
STATE          STIME    FMRI
online         Oct_22   svc:/milestone/network:default
online         Oct_22   svc:/system/filesystem/local:default
online         Oct_22   svc:/milestone/sysconfig:default
online         Oct_22   svc:/milestone/name-services:default

Solaris 11

mysolaris11 :~> svcs svc:/milestone/sysconfig svc:/system/filesystem/local svc:/milestone/name-services svc:/milestone/network
svcs: Pattern 'svc:/milestone/sysconfig' doesn't match any instances
STATE          STIME    FMRI
online         Jun_06   svc:/milestone/network:default
online         Jun_06   svc:/system/filesystem/local:default
online         Jun_06   svc:/milestone/name-services:default

However, Solaris 11 has svc:/milestone/config:default instead, which appears to be the replacement for svc:/milestone/sysconfig.

Honestly, I think the default dependencies need to be reworked anyways. I
should be able to push a fix later today, but would welcome some thoughts
on a better way to organize default deps.

/cc @bixu

@cruschke I think I'm going to hack this fix together in the short term. I don't have a Solaris 11 machine to test, however. Could you let me know what node.platform, node.platform_family and node.platform_version are?

Looking at Ohai, my worry is that it will show up as solaris2, which is indistinguishable from Solaris 10. How do I make sure this doesn't break older versions?

Seems like we should try to get "platform_family" to report either "illumos" or "solaris", while "platform" should return "solaris10", "solaris11", "smartos", etc. But I may be misunderstanding the intended use of these automatic attributes?

This is from ohai on my Solaris 11:

"platform": "solaris2",
"platform_family": "solaris2",
"platform_version": "5.11",

In general, for platforms that you don't have available to test, you could look at Fauxhai to see what SmartOS would report.
OmniOS should then also be considered as a supported platform.

I think using only_if (or not_if) with the "os_version": "5.11" node attribute should properly distinguish between Solaris 10 and Solaris 11 and their forks (SmartOS/OmniOS). platform_version would not work on OmniOS, as they put their build number into that value.

@cruschke sorry for the delay on a fix, but I just want to ensure that I don't break Solaris 10 compatibility. Do you happen to have a Solaris 10 machine available to verify that platform_version is reported as 5.10?

Sure I do. This is Ohai output on Solaris 10:

"platform_version": "5.10",
"platform": "solaris2"
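
Taken together, the two Ohai outputs suggest that platform alone cannot separate Solaris 10 from Solaris 11, while platform_version can. A minimal sketch of such a selection follows, assuming a hypothetical helper named default_dependencies and a plain hash standing in for the Chef node; it illustrates the idea and is not the cookbook's actual code.

```ruby
# Dependencies that exist on both Solaris 10 and Solaris 11 (from the issue text).
SHARED_DEPENDENCIES = [
  'svc:/system/filesystem/local',
  'svc:/milestone/name-services',
  'svc:/milestone/network'
].freeze

# Hypothetical helper: pick the default dependency list from Ohai-style attributes.
# Assumption: only platform "solaris2" with platform_version "5.11" needs
# svc:/milestone/config; everything else keeps svc:/milestone/sysconfig.
def default_dependencies(node)
  is_solaris11 = node['platform'] == 'solaris2' && node['platform_version'] == '5.11'
  milestone = is_solaris11 ? 'svc:/milestone/config' : 'svc:/milestone/sysconfig'
  [milestone] + SHARED_DEPENDENCIES
end

solaris10_node = { 'platform' => 'solaris2', 'platform_version' => '5.10' }
solaris11_node = { 'platform' => 'solaris2', 'platform_version' => '5.11' }

puts default_dependencies(solaris10_node).first  # svc:/milestone/sysconfig
puts default_dependencies(solaris11_node).first  # svc:/milestone/config
```

Whether the illumos forks should take the same branch as Solaris 10 is an open question in this thread, so the fallback here is only a placeholder.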

@cruschke ok, finally pushed a new version to github and to community.opscode.com. Can you test to ensure that it works out?

Thanks for looking at this issue; however, I believe it's not entirely fixed ...

Here is my test scenario:

uname from my Vagrant box

vagrant@live-solaris-11-1:~$ uname -a
SunOS live-solaris-11-1.vagrantup.com 5.11 11.1 i86pc i386 i86pc

This is the debug output of Chef run ...

  * smf[splunk] action install       [2014-06-20T01:07:13+10:00] INFO: Processing smf[splunk] action install (myrecipe::_splunkforwarder_service_smf line 23)
       STATE          STIME    FMRI
       STATE          STIME    FMRI
       [2014-06-20T01:07:14+10:00] DEBUG: Loaded checksum for SMF splunk:
       [2014-06-20T01:07:14+10:00] DEBUG: SMF service already exists for /application/management/splunk? false
       [2014-06-20T01:07:14+10:00] DEBUG: Creating manifest directory at /var/svc/manifest/application
       [2014-06-20T01:07:14+10:00] DEBUG: Writing SMF manifest for splunk
       [2014-06-20T01:07:14+10:00] DEBUG: importing SMF manifest /var/svc/manifest/application/splunk.xml
       1
       [2014-06-20T01:07:15+10:00] DEBUG: Saving checksum for SMF splunk: c2c2d8e7a3bc0d93c326adf72a47a00c

Here is the generated SMF

vagrant@live-solaris-11-1:~$ svccfg export splunk
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
  <service name='application/management/splunk' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>
    <dependency name='milestone' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/milestone/sysconfig'/>
    </dependency>
    <dependency name='fs-local' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/local'/>
    </dependency>
    <dependency name='name-services' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/milestone/name-services'/>
    </dependency>
    <dependency name='network' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/milestone/network'/>
    </dependency>
    <method_context>
      <method_credential privileges='basic,net_privaddr' user='vagrant'/>
    </method_context>
    <exec_method name='start' type='method' exec='/opt/splunkforwarder/bin/splunk start' timeout_seconds='60'/>
    <exec_method name='stop' type='method' exec='/opt/splunkforwarder/bin/splunk stop' timeout_seconds='60'/>
    <exec_method name='restart' type='method' exec='/opt/splunkforwarder/bin/splunk restart' timeout_seconds='60'/>
    <property_group name='general' type='framework'>
      <propval name='action_authorization' type='astring' value='solaris.smf.manage.splunk'/>
      <propval name='value_authorization' type='astring' value='solaris.smf.value.splunk'/>
    </property_group>
    <stability value='Evolving'/>
    <template>
      <common_name>
        <loctext xml:lang='C'>splunk</loctext>
      </common_name>
    </template>
  </service>
</service_bundle>

As you can see, it's still depending on the sysconfig milestone, although it should depend on config instead.

Oh... I think I know why this is happening, but I'll need to look a bit to
find the right solution.

If I'm correct, new files written will be correct. The problem will only be
updating existing files. You could even delete the checksum file for this
service in /var/chef/checksums to force it to rewrite the file.
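
To illustrate why deleting the checksum file would force a rewrite, here is a generic checksum guard. It is a sketch under assumptions: the helper name write_if_changed, the use of MD5, and the file names are hypothetical, and the cookbook's real logic may differ.

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical helper: write the manifest only when the stored checksum
# is missing or differs from the checksum of the new content.
def write_if_changed(manifest_path, checksum_path, content)
  new_sum = Digest::MD5.hexdigest(content)
  old_sum = File.exist?(checksum_path) ? File.read(checksum_path).strip : nil
  return false if new_sum == old_sum # stored checksum matches: skip the write

  File.write(manifest_path, content)
  File.write(checksum_path, new_sum)
  true
end

Dir.mktmpdir do |dir|
  manifest = File.join(dir, 'splunk.xml')
  checksum = File.join(dir, 'splunk.checksum')
  puts write_if_changed(manifest, checksum, '<service_bundle/>') # true: first run writes
  puts write_if_changed(manifest, checksum, '<service_bundle/>') # false: checksum matches
  File.delete(checksum)                                          # drop the stored checksum
  puts write_if_changed(manifest, checksum, '<service_bundle/>') # true: rewrite is forced
end
```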

The test scenario I am using rebuilds the entire VM from scratch (using Test Kitchen), so there is guaranteed to be no leftover SMF manifest.

Recipe: bild_monitoring::_splunkforwarder_service_smf
  * smf[splunk] action install       [2014-06-20T16:51:35+10:00] INFO: Processing smf[splunk] action install (bild_monitoring::_splunkforwarder_service_smf line 23)
       STATE          STIME    FMRI
       STATE          STIME    FMRI
       [2014-06-20T16:51:35+10:00] DEBUG: Loaded checksum for SMF splunk:
       [2014-06-20T16:51:35+10:00] DEBUG: SMF service already exists for /application/management/splunk? false
       [2014-06-20T16:51:35+10:00] DEBUG: Creating manifest directory at /var/svc/manifest/application
       [2014-06-20T16:51:35+10:00] DEBUG: Writing SMF manifest for splunk
       [2014-06-20T16:51:35+10:00] DEBUG: importing SMF manifest /var/svc/manifest/application/splunk.xml
       1
       [2014-06-20T16:51:36+10:00] DEBUG: Saving checksum for SMF splunk: c2c2d8e7a3bc0d93c326adf72a47a00c



       [2014-06-20T16:51:36+10:00] INFO: smf[splunk] sending enable action to service[splunk] (immediate)
Recipe: bild_monitoring::default
  * service[splunk] action enable       [2014-06-20T16:51:36+10:00] INFO: Processing service[splunk] action enable (bild_monitoring::default line 29)
       fmri         svc:/application/management/splunk:default
       name         splunk
       enabled      false
       state        disabled
       next_state   none
       state_time   Fri Jun 20 16:51:37 2014
       restarter    svc:/system/svc/restarter:default
       manifest     /var/svc/manifest/application/splunk.xml
       dependency   require_all/none svc:/milestone/sysconfig (absent)
       dependency   require_all/none svc:/system/filesystem/local (online)
       dependency   require_all/none svc:/milestone/name-services (online)
       dependency   require_all/none svc:/milestone/network (online)
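
The listing above, which looks like `svcs -l` output, marks the missing milestone as (absent). A small sketch of how a test could flag such dependencies automatically, assuming a hypothetical helper absent_dependencies that takes the listing as a string:

```ruby
# Dependency lines copied from the listing above.
SVCS_LISTING = <<~LISTING
  dependency   require_all/none svc:/milestone/sysconfig (absent)
  dependency   require_all/none svc:/system/filesystem/local (online)
  dependency   require_all/none svc:/milestone/name-services (online)
  dependency   require_all/none svc:/milestone/network (online)
LISTING

# Hypothetical helper: return the FMRIs of all dependencies reported as absent.
def absent_dependencies(listing)
  listing.each_line.map do |line|
    match = line.match(/^\s*dependency\s+\S+\s+(\S+)\s+\(absent\)/)
    match && match[1]
  end.compact
end

puts absent_dependencies(SVCS_LISTING).inspect # ["svc:/milestone/sysconfig"]
```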

Ok, I'm going to do what I should have done from the start and try to get
test-kitchen set up with a test suite.

I see that there's a solaris 11 vagrant box available here:
https://vagrantcloud.com/ruby-concurrency/oracle-solaris-11. Hopefully
between it and the available OmniOS boxes I can get this sorted out.

@cruschke it looks like this may be an issue with ohai. Look at this commit, which appears to fix the detection of Oracle Solaris: chef/ohai@030077c. It looks like the current release of ohai is too strict in its pattern matching. I see in /etc/release on Solaris 11 that the text is "Oracle Solaris 11.1 X86", which is not going to be matched by the regex in ohai 7.0.4 or earlier.

$ ohai --version
Ohai: 7.0.4
$ ohai | grep platform
  "platform_version": "5.11",
  "platform_build": "11.1",
  "platform_family": null,

$ sudo /opt/chef/embedded/bin/gem install ohai --pre
 ....

$ ohai --version
Ohai: 7.2.0.alpha.0
$ ohai | grep platform
  "platform_version": "5.11",
  "platform_build": "11.1",
  "platform": "solaris2",
  "platform_family": "solaris2",
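
As an illustration of how an overly strict pattern can miss the release string quoted above, consider the sketch below. Both patterns are hypothetical examples written for this note; neither is copied from Ohai.

```ruby
# Release string quoted in this thread from /etc/release on Solaris 11.
RELEASE_LINE = 'Oracle Solaris 11.1 X86'

# Illustrative patterns only (assumptions, not Ohai's actual regexes).
STRICT_PATTERN  = /Oracle Solaris (\d+) /          # requires a bare integer version
RELAXED_PATTERN = /Oracle Solaris (\d+(?:\.\d+)*)/ # also accepts dotted versions

# Return the captured version string, or nil when the pattern does not match.
def release_version(line, pattern)
  match = pattern.match(line)
  match && match[1]
end

puts release_version(RELEASE_LINE, STRICT_PATTERN).inspect  # nil
puts release_version(RELEASE_LINE, RELAXED_PATTERN).inspect # "11.1"
```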

I definitely want to match on platform == "solaris2", so I can specifically target Oracle Solaris.

...and it was a stupid typo I made 😄. I just pushed a new version (2.0.6) of the cookbook to the community site with the correct platform matching.

I've known I had to set up test-kitchen for a while, as well as write up some chefspecs. This is a good incentive!

As a note, I had to use the latest alpha version of Chef and Ohai in order for this to work. I'm not sure if it's dependent on the specific release of Solaris 11, or maybe older versions of Chef work, as it appears from above that the versions you're using correctly match solaris2.

Thanks, now it's working as expected on both Solaris 10 and 11 👍.